LETTER OF TRANSMITTAL


Department of the Interior, 

Office of Education, 
Washington, D. C., May, 1932.

Sir: About two years ago it occurred to me to attempt a conference
on research in higher education. Accordingly, I discussed the matter 
with Dr. Arnold Bennett Hall, President of the University of Oregon 
and we decided to hold a preliminary conference on this subject in 
April, 1931. Men were gathered from the entire Northwest for a
2-day conference on the subject. Some of the papers were of considerable
value in opening up the subject. These papers were brought
together and disposed of in some four ways. Here are the ones
which we consider worthy of printing as beginning our work in research
in higher education. I recommend that these be printed as a
bulletin of the Office of Education. 

Respectfully submitted. 

Wm. John Cooper, 

Commissioner. 

The Secretary of the Interior. 
FOREWORD

The application of scientific method to the study of college problems
in the fields of curricula, methods, administration, and student personnel
is increasing rapidly. The experimental method — the setting
up of alternative procedures in such a way that the factor under
study may be isolated for measurement — appeals more to college
faculty members than other methods of investigation. It is the
method to which many of them are accustomed in their academic
fields. It is the method which allows for greatest objectivity, and
hence carries greatest weight.

In higher education, there are many long-established practices and
well-settled convictions. These may be molested only in the light of
definitely authenticated facts. Faculty discussions tend to be fruitless
because of the absence of such authentication. If the present
general unrest in the colleges is to result in wise changes, there must
be widespread experimentation with the problems involved.

Recognizing the fundamental importance of this experimentation, 
the Division of Colleges and Professional Schools of the United States 
Office of Education hopes to shape its program so as to be of as great 
assistance as possible in stimulating experimentation in the universities 
and colleges. The conference in the Pacific Northwest at which the
papers published in this bulletin were read, was the first of what it is 
hoped may be a series of regional conferences where results of experi- 
mentation may be reported and discussed. The present bulletin is 
the first of what it is hoped may be a series of bulletins to be published 
by the United States Office of Education, making available the results 
of experimental studies in higher education.

The facilities of the library in the Office of Education and the services
of the staff in the Division of Colleges and Professional Schools
will be available to aid in forwarding this program of experimentation. 

Fred J. Kelly,

Chief, Division of Colleges and Professional Schools.
RESEARCH IN HIGHER EDUCATION

INTRODUCTION 

Henry Davidson Sheldon
Dean of the School of Education, University of Oregon

The papers here published were read in connection with the pro- 
gram of the First Conference on Higher Education for the Pacific 
Northwest, held at Eugene, Oreg., April 14 to 16, 1931, under the 
joint auspices of the United States Office of Education and the 
University of Oregon. Those prepared by members of the University 
of Oregon staff, together with an earlier collection by the university 
under the title of “Controlled Experimentation in the Study of 
Methods of College Teaching,” constitute the permanent results of a 
5-year experimental program for the Improvement of College Teaching 
in the University of Oregon. The fundamental idea back of this
program has been not so much the production of permanent contri- 
butions to the scientific literature of this field, although it is hoped 
that this may be a by-product, but rather the stimulating of interest 
on the part of the teaching staff of the university in a more effective 
point of approach in handling the teaching situation. Consequently, 
at this time there may be a certain value in endeavoring to summarize 
the results, as far as they have been achieved, from this point of view. 

The first problem attacked in this program at the University of 
Oregon was the problem of securing greater initiative and more alert 
intelligence on the part of the students in college courses. The criti- 
cism heard frequently that the prevailing modes of instruction in 
college, whether based on lectures, textbooks, or library reading, were
largely passive and that the student is engaged in absorption rather 
than discovery, was felt to have much weight. The protests against 
this condition in various fields of education are well known. For the 
preschool age, Madam Montessori; for the elementary and junior 
high school, the problem-project method of approach; in science 
instruction, the laboratory method, have all been designed to improve 
this situation. In the field of higher and professional education, the 
law schools of the country, through the employment of the case
method of instruction, seem to have made the most distinct progress in 
this field. In the University of Oregon, several departments under- 
took to organize experiments which should endeavor to show the value 
of a set-up involving more initiative. The departments in question
were psychology, economics, sociology, and education. Some of the
results of these experiments have been printed in the earlier volume. 
The experimentation has been continued but without material addition 
to the results there published. Apparently the case problem-project 
method secures more response and better results from the able
students without materially lessening the achievement of those
belonging to the inferior and mediocre groups. More than this we 
can not say until it has been possible to work out through several 
years of special appliance of a variety of techniques on a college 
level for this new type of work, something which the law schools and 
schools of business administration have already done. 

In a period when the limitation of numbers has become, because of 
the expense of higher education, a very pertinent problem to college 
administrations, the question of means of selecting students naturally 
comes to the forefront and it becomes important to discover to what 
extent it is possible to determine in advance the scholastic future of
candidates. The University of Oregon has worked on this problem
from two points of view. The establishment of intelligence tests for all
entering students during the last five years has furnished material for
the study of the value of these tests as a means of prediction. 
Dr. Howard Taylor, who has had charge of this work, has organized 
and interpreted this material in one of the papers in the present collection.
Considerable attention has also been given to the second
aspect — that of special aptitude tests. Here two tests, one in account- 
ing by Professor Stillman and one in English by Mr. Shumaker, have 
been developed to a point of usefulness beyond previous contributions 
in this field. Beginnings have been made in two other fields, jour- 
nalism and teaching. Experience has shown that success here depends 
on a very large amount of empirical experimentation and can only be 
secured at great expense of time and energy through several years. 

The reliability of grades and the improvement of current examination
systems constitute another province for investigation. While it
has been the purpose of the committee in charge to encourage a much
larger use of objective examinations than has hitherto been the case,
it was also felt that much might be done in the improvement of the
so-called essay type of examinations. The cooperation of the
university administration in asking all departments to file copies of
each examination has placed a large reservoir of material at the
disposition of the committee. Two papers in the present collection
by Clifford Constance and Ralph Leighton represent the contribution
here.

The university has also cooperated in the movement, general 
throughout the country, to assist freshmen and other beginning
students in study procedures through reading tests, schemes for the
organization of time, use of library, etc. The number of students
exposed to this procedure is not large but the results which are given
in a paper by Messrs. Tuttle and Davis are reassuring. Other
aspects of this problem are dealt with by the papers of Professor
Parr, of Oregon State College, and Professor Jordan, of State Teachers
College, Ellensburg, Wash. To secure further results and values
here, there should be a much larger number of cases which should
be studied individually and segregated by types. The fact that 
many colleges and universities are laboring in this field suggests the 
possibilities of cooperative institutional research. 

In addition to the lines of endeavor already mentioned, there have 
been certain miscellaneous experiments such as the one in the field 
of English history on the value of formal quizzes, published in the
early collection by Dr. Donald Barnes. These studies are being
continued and certainly point to the necessity of certain changes in
procedure. 

From the point of view of understanding present conditions in the
University of Oregon, there has been begun a series of investigations
of a collective and local sort not represented in the present volume,
because of the late hour of completion. In an institution of large
size the actual results of the teaching procedures are difficult to 
determine and student and faculty gossip concerning them is by no 
means trustworthy. Consequently, the committee in charge of 
investigating college teaching has organized a number of studies to 
determine the results of certain devices in teaching. The procedures
taken up in this way have been term papers, investigated by a
special committee of which Prof. L. L. Lewis, of the English department,
was chairman, and the actual amount of time spent in assigned
readings in book reserve by Prof. Virgil Earl. Certain departments
of the university depend on this method exclusively for students’
study. The value of segregation according to ability in certain
sections has also been studied and reported on.

The investigations of college teaching at the University of Oregon 
have been entirely the work of a large faculty committee of 12 members
representing many of the important schools and departments of
the university. At first this committee was made up largely of
department heads and deans of departments. It was discovered,
however, in course of time that, after the general policy had been
outlined, more fruitful results could be obtained by a membership
made up in the main of younger men in the departments who have
the time and leisure to undertake studies themselves. Consequently
the committee is more and more becoming a clearing house for the 
discussion of technique along the lines represented. 

The university administration has assisted by wholeheartedly supporting
the program of the committee and also by furnishing small
sums of money, which are necessary for mimeographing certain material,
printing tests, and for the mechanical and statistical manipulation
of the results of the experiment. The institution has also supplied
technical statistical advisers for consultation purposes. These
two forms of practical aid are undoubtedly necessary if a program 
of this sort is to be carried out effectively. 

In diffusing the results of the investigations, of conferences, and 
visits to other institutions, the committee has adopted the plan of 
holding special meetings of the faculty known as “colloquia” where 
the results are presented and discussed. Some of the meetings have 
been largely attended and have constituted a valuable stimulus. 
As the program has proceeded, the necessity for these formal meet- 
ings has largely disappeared as it is found that the spontaneous 
interest of each group in a particular experiment creates a more
genuine reaction than any formal meeting could secure. 

The committee has also endeavored to post the faculty through the
publication of mimeographed bibliographies, special book reviews, 
and through the establishment of a special shelf in the library with 
duplicate copies of important books and periodicals on college teach- 
ing. The personnel office under the direction of Dr. Howard Taylor 
has also assisted by sending out mimeographed studies of certain 
local problems based on material collected.

The results of a campaign of this sort are somewhat difficult to
estimate. Undoubtedly a more analytic attitude toward the entire 
subject of college teaching has resulted with a majority of the faculty. 
Two-thirds of the departments and schools of the University have
at one time or another participated in certain experimental activities
and at least one-third have carried on somewhat systematic experimental
programs. This indirect method of approach, depending on
the voluntary activity of members of the faculty, is undoubtedly
slower in securing results than certain other more direct campaigns 
involving the teaching staff in compulsory systematic activities but 
it is felt that it avoids a large amount of antagonism and probably 
in the long run is more pervasive in changing the attitude of the 
instructing staff. 
GROUP I— INSTRUMENTS OF MEASUREMENT IN
THE FIELD OF COLLEGE INSTRUCTION

HOW RELIABLE ARE COLLEGE MARKS? ¹

Howard R. Taylor ² and Clifford L. Constance ³

If college marks were merely rewards for the more or less faithful
performance of academic tasks, they would probably not merit
scientific study. Since, in general, marks have become increasingly
important as a basis for administrative and educational procedures —
in American colleges at any rate — some determination of the credence
which they deserve is likewise increasingly important. Faculties and
administrators usually put their requirements of students in terms
of marks. Students are dismissed, prevented from transferring to
other institutions, ruled ineligible for athletic competition, awarded
fellowships, accorded privileges such as working for honors, and so
on — all on the basis of marks. Again, studies of learning suggest that
a more or less objective knowledge of results, such as marks attempt to
provide for college instruction, is a positive factor in improvement.

The selective function of colleges — a seldom recognized but perhaps
major service of such institutions — is primarily dependent on
the marking of students. Moreover, although our whole social
organization is based upon individual differences in general capacity
and special aptitude, school marks provide almost the only organized
attempt to give individuals the information about themselves in
comparison with others which is indispensable if ambitions are to be
brought in line with actualities. Evidence that college marks perform
such a selective guidance function with at least partial validity
is plentiful. President Gifford (3) ⁴ of the Bell Telephone Co. has

¹ The writer of this paper is Mr. Taylor, but the study was made by Mr. Constance under the writer's
direction as a master's thesis in psychology at the university.

² Howard R. Taylor, director, Personnel Research Bureau, University of Oregon. A. B., Pacific University,
1914; A. M., Stanford University, 1923, Ph. D., 1927. Publications: “The Need for Personnel
Research in a University,” School and Society, 20:673, Nov. 19, 1927; with F. F. Powers, “Bible Study and
Character,” Pedagogical Seminary and Journal of Genetic Psychology, 294-302, June, 1928; “The Influence of
the Teacher on Relative Class Standing in Arithmetic Fundamentals and Reading Comprehension,”
Twenty-seventh Yearbook of the National Society for the Study of Education, Part II, chapter 6, 97-110, 1928;
“An Experiment with Independent Study,” Controlled Experimentation in the Study of Methods of College
Teaching, University of Oregon Publication, Education Series, 1:7:300-313, February, 1929; “Teacher
Influence on Class Achievement: A Study of the Relationship of Estimated Teaching Ability to Pupil
Achievement in Reading and Arithmetic,” Genetic Psychology Monographs, 7:2, February, 1930.

³ Clifford L. Constance, assistant registrar, University of Oregon. B. A., University of Oregon, 1925,
M. A., 11^. Publications: “Greeks of the Campus,” School and Society, 30:409-414, 1929; “Personality
Ratings Given High-School Graduates by Principals and Teachers,” School Review, 39:583-588, 1931.

⁴ Numbers in parentheses refer to “Bibliography,” p. 14.
shown why his organization uses marks made in college work as a
basis for employment — high marks tend to indicate the abler men.
In college the guidance of students by marks into fields of study where
their interests and ambitions harmonize with their abilities occurs
continuously on both the conscious and unconscious level. 

Hence, it would seem that these varied uses of college marks are 
necessary and reasonable enough if only the marks are dependable. 
In fact, most of the objections to marking students disappear in proportion
to the reliability and validity of such judgments. Even the
artificiality of marks is unimportant if only they measure accurately
what they are supposed to measure. While the validation of college
marks, i. e., the social and individual significance of excellence in the
various fields of college instruction, is perhaps even more important
than the matter of reliability, this study will be confined to the latter
simpler but still very important issue.

When a surveyor measures off a city lot, he assumes reasonably
enough (a) that his tape is equivalent to any other good tape within
negligible limits; (b) that his errors in reading it are negligible, or at
least that they can be made so by repeated measurements; (c) that
the area and shape of the plot will remain constant year after year,
or at least practically so. When psychologists attempt to apply
scientific techniques to such measures as college marks, the extent to
which analogous sources of error are negligible must be empirically
determined. 

When an instructor appraises student achievement, (a) it is seldom
or never true that he uses exactly the same standards as other equally
competent instructors. Nor can he apply the same examination
again and again in the way a surveyor uses a steel tape. Hence,
various scholastic tape measures are fearfully and wonderfully different
one from another. (b) Seldom will two instructors when evaluating
the same sample of student performance agree as to its merit. Even
two successive appraisals of the same paper by the same instructor
are likely to differ. Thus instructors are also unable to read very
precisely such scholastic scales as they do have. Even in supposedly
objective subjects, such as mathematics and science, disagreements in
appraising performance are large, as studies by Starch, Ruch, Wood,
and others have repeatedly demonstrated (7, 9, 11). The remedy as
in all scientific measurement is to make repeated measurements so
that successive errors of a chance sort will tend to cancel. Thus the
sum of many relatively crude measurements may be decidedly better
than the separate estimates of which it is composed. (c) Furthermore,
the abilities and especially the performances of students are not
the same yesterday, to-day, and forever. Students are affected by
health, happiness, and countless other things so that a measure of a
student's true achievement would necessitate many evaluations under
all reasonable conditions over a fairly long period of time. The
average of such a series of measures might be considered the student's
true ability. Especially during instruction with fluctuating effort, to
which college students are prone, it is reasonable to expect considerable
change both absolute and relative in student achievement from
week to week and year to year.
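
The gain from repeated measurement described above can be illustrated with a small simulation. This is a hypothetical sketch and not a computation from the study: the number of students, the spread of "true" scores, and the size of the chance error in a single appraisal (standard deviation 1.0) are all assumed for illustration, and the chance errors are assumed to be independent of one another.

```python
import random
from math import sqrt

def rms_error(estimates, truths):
    """Root-mean-square difference between estimates and the true values."""
    squared = [(e - t) ** 2 for e, t in zip(estimates, truths)]
    return sqrt(sum(squared) / len(squared))

rng = random.Random(0)  # fixed seed so the illustration is repeatable

# Hypothetical "true" achievement of 200 students (assumed values).
true_scores = [rng.uniform(1.0, 4.0) for _ in range(200)]

def one_appraisal(true_score):
    """One crude appraisal: the true score plus an independent chance error."""
    return true_score + rng.gauss(0.0, 1.0)  # assumed error s.d. of 1.0

# A single appraisal per student versus the average of 16 appraisals.
single = [one_appraisal(t) for t in true_scores]
averaged = [sum(one_appraisal(t) for _ in range(16)) / 16 for t in true_scores]

rms_single = rms_error(single, true_scores)
rms_averaged = rms_error(averaged, true_scores)
print(f"single appraisal error:  {rms_single:.2f}")
print(f"average of 16 appraisals: {rms_averaged:.2f}")
```

Under these assumptions the error of the averaged estimate comes out at a fraction of the error of a single appraisal. Errors that are shared across appraisals, such as a constant bias of one instructor, would not cancel in this way.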

Fortunately it is the relative size only of errors which has significance.
An error of an ounce or two — even a few pounds — is usually negligible
in weighing human adults because the differences between individuals
are, in general, much greater than an error of this degree.
If one were weighing mosquitoes, finer scales would be necessary,
because the heaviest mosquito weighs only a small fraction of an
ounce more than the lightest. Since in the case of college marks (a)
inequalities in standards and errors in sampling scholastic performance,
(b) errors in evaluating these samples of student performance,
and (c) fluctuations in the performance of individual students are
all inherent in the situations where the marks are to be used, it is 
the unreliability of marks as a result of all these sources of error that 
we wish to determine. 

Now, if two similar measurements of the scholastic achievement of
each student are made independently at reasonable intervals, and if
each student tends to hold about the same rank in comparison with
others, we may conclude that the errors from all these sources are
small in comparison with the total range of individual differences in
such achievement. Since the correlation coefficient expresses very
compactly the extent to which one measurement of a group of individuals
will predict another in linear fashion, it is customary to consider
the correlation of two similar measurements made independently
under representative conditions an index of the accuracy of
the measurements.
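
The index just described can be sketched in modern terms as follows. The fragment is an illustration only and not part of the original study: the function name `pearson_r` and the two short series of grade-point ratios are hypothetical, chosen merely to show how the correlation between two independent sets of measurements of the same students is obtained.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Product-moment correlation between two paired series of measurements."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two paired series of equal length (n >= 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a series with no variation has no defined correlation")
    return sxy / sqrt(sxx * syy)

# Hypothetical grade-point ratios for five students, measured twice.
first_measure = [1.0, 2.0, 3.0, 4.0, 5.0]
second_measure = [1.5, 1.8, 3.2, 3.9, 4.6]
reliability = pearson_r(first_measure, second_measure)
print(round(reliability, 2))
```

A coefficient near 1 means the students keep nearly the same relative standing in both sets of measurements; a coefficient near 0 means the first set tells almost nothing about the second.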

In studying the reliability of college marks the practice has been
to correlate successive quarters, semesters, or years of college work.
But students often do not take closely similar programs of work, even
in successive quarters and semesters, and certainly not in successive
college years. Thus the kind of performance sampled when the
excellence of work in two different terms is correlated, theoretically
at least, is not sufficiently similar to indicate the real accuracy of the
measurements. In so far as samples of dissimilar scholastic performance
are involved, such comparisons give reliability coefficients which
are too low. On the other hand, instructors are seldom able to make
completely independent estimates of the merit of performance in
successive terms of the course and students are likely to be rated
twice on much the same basis when scholastic performance for two
such terms is correlated. Then the comparisons are likely to give
spuriously high reliability coefficients because of correlated errors in
the two series of estimates.

Finally, we can hardly expect the distractions of college life and
the accompanying individual fluctuations in interest and effort to
affect all students alike. Football affects the gridiron warriors and
their partisans most in the fall quarter. For the basket ball men
and their following whatever effects there are come chiefly in the
winter term, while canoeists and poets are probably most susceptible
to “spring fever” at that time of year. Thus correlations between
quarters and semesters of college work furnish reliability coefficients
which are too low because the fluctuations in individual performance
for the periods compared are not typical.

In our study we have attempted to minimize these three (in part 
counteracting) sources of unrepresentative error in the assignment 
of college marks. We began with the records of 418 men and 403
women who entered as freshmen and completed the fall term of
1925-26 at the University of Oregon. Of these, 184 men and 212
women remained for two consecutive years, 1925-26 and 1926-27.
Their scholastic records in these six quarters of lower division work
furnished the basic data for the study. Thus we had six measurements —
marks evaluating the scholastic work of six consecutive quarters —
for each individual. These marks were then segregated by
departments as well as by quarters for each student. We then took
marks earned in the fall of 1925, spring of 1926, and winter of 1927
(the odd quarters of our six) and compared them with marks earned
in the winter of 1926, fall of 1926, and spring of 1927 (the even quarters).
We then paired these marks by departments, so that in each
set of measures the kind of scholastic performance rated was matched
in the other set, giving us two representative samples of the scholastic
performance of each student for each department in which he took
courses. Thus the kinds of work attempted are reasonably similar.
The estimates have been made by different instructors, or at least
separated by a 3-month interval of time, so as to favor independence
in the judgments. Work done in similar quarters of different years
has been utilized to make fluctuations in individual performance
comparable. Wherever a student had two terms of work in a given
department both of which happened to fall in the same set of terms,
i. e., both in the odd or both in the even set, they of course could not 
be used. This occurred so seldom that only 276 out of more than 
11,000 separate grades had to be discarded in the study. 

Marks at the University of Oregon are on a 6-step scale, as follows:
I, unusual excellence; II, high quality; III, satisfactory; IV, fair;
V, passing; VI or F, failure. If marks III and IV be combined as
average, the scale is practically the same as the more widely used
5-step scale. The university catalogue states the expected proportions
of the average class to which these various grades will be assigned
as I and II, 20-25 per cent; III and IV, 55-65 per cent; V, 15-20
per cent; VI or F, not stated.

The actual distribution of grades at Oregon is skewed a little toward
the upper end of the scale, a fact which might be explained on the
hypothesis that Oregon has a preponderance of above average students,
but is probably an expression of the leniency of most professors
in giving more than the expected proportion of students the benefit
of scholastic doubts. The percentages of each grade assigned throughout
the university are quite constant from year to year, and at the
time of this study were I = 9 per cent, II = 23 per cent, III = 35 per
cent, IV = 19 per cent, V = 9 per cent, and VI or F = 5 per cent.
Since studies of college marks for large numbers of students over a 
considerable period of time at many representative institutions have 
empirically demonstrated a uniform tendency to approximate the 
normal curve in such distributions, we have determined the numer- 
ical weight to be assigned each mark by determining the mean devia- 
tion in a unit normal distribution of each portion corresponding to 
the actual percentages of each grade assigned at the University of 
Oregon (4). Of course, departments differ considerably in the per- 
centages of students to which each mark is assigned, and there are 
selective factors which in some cases offset and in others aggravate
the inaccuracies suggested by these differences. But since marks 
have been paired by departments, such errors will be practically 
constant for each individual in both sets of measures, and hence 
should not detract from the determination of their relative reliability. 
Moreover, our findings in general agree with those of Spence (8) 
when he says that the improvement produced by correcting for the 
variations due to different percentages of grades assigned by different
instructors is not large, and again that while there are significant
differences in the intellectual level of various classes, correction of 
grades for such differences at present makes very little difference in 
the final composite for any student. 
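
The weighting procedure described in this paragraph, taking the mean deviation of each portion of a unit normal distribution, can be sketched as follows. This is a modern illustration under stated assumptions, not the authors' own computation: the function name `slice_means` is hypothetical, the percentages are the rounded ones quoted in the text, and the figures the sketch prints are its own and need not agree exactly with the values reported in the paper, whose tables and exact percentages are not given.

```python
from statistics import NormalDist

def slice_means(percentages):
    """Mean standard score of each slice of a unit normal curve.

    `percentages` gives the share of students receiving each mark,
    best mark first; the shares must be positive and sum to 100.
    """
    if any(p <= 0 for p in percentages):
        raise ValueError("each percentage must be positive")
    if abs(sum(percentages) - 100.0) > 1e-9:
        raise ValueError("percentages must sum to 100")
    unit = NormalDist()  # mean 0, standard deviation 1

    def ordinate(cumulative):
        # Height of the curve at the point below which `cumulative` of the
        # area lies; the height vanishes at both extremes.
        if cumulative <= 1e-12 or cumulative >= 1.0 - 1e-12:
            return 0.0
        return unit.pdf(unit.inv_cdf(cumulative))

    means = []
    below = 100.0  # per cent of students below the current upper boundary
    for pct in percentages:
        upper = below / 100.0
        lower = (below - pct) / 100.0
        # Mean of a slice = (ordinate at lower cut - ordinate at upper cut) / area.
        means.append((ordinate(lower) - ordinate(upper)) / (pct / 100.0))
        below -= pct
    return means

oregon_percentages = [9, 23, 35, 19, 9, 5]  # I, II, III, IV, V, VI or F
weights = slice_means(oregon_percentages)
for mark, weight in zip(["I", "II", "III", "IV", "V", "VI or F"], weights):
    print(f"{mark}: {weight:+.2f}")
```

Because the slices together make up the whole curve, the weights averaged over all students come to zero, which is a convenient check on the arithmetic.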

The standard deviation values for each category of the grading
scale according to the percentages of each grade actually assigned
came out I = +1.82, II = +0.86, III = +0.04, IV = -0.69, V = -1.27,
VI or F = -2.40. It can be shown mathematically that the relative
value of the weights remains the same when the constant 2.4 is
added to each so as to avoid the use of negative numbers in computation.*
The weights thus computed would be I = 4.2, II = 3.3,
III = 2.4, IV = 1.7, V = 1.1, and VI or F = 0. The multiplication of
these by another constant would distribute them proportionately
over the interval from 0-5, giving I = 5, II = 3.9, III = 2.9, IV = 2.0,
V = 1.3, and VI or F = 0. The traditional weighting of grades for
quality already in use at the University of Oregon, where an hour
of I = 5 points, II = 4 points, III = 3 points, IV = 2 points,
V = 1 point, and VI or F = 0, is as close to these statistically
determined values as could be expected without resorting to decimals.
We have therefore used these traditional weights throughout
our study. Marks of “Incomplete” and “Dropped” were ignored.
In passing it may be noted that these traditional weights penalize
the poor students somewhat and award a small extra premium to
the very capable ones.

* We are indebted to Dr. W. E. Milne for a proof of this proposition.
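
The arithmetic of the preceding paragraphs (adding a constant, then multiplying by a second constant to spread the weights over the interval 0 to 5) can be checked with a short sketch. The standard deviation values are those printed in the text; the function name `rescale` is hypothetical, and the whole-number traditional weights of 5, 4, 3, 2, 1, and 0 are an assumed reading of a partly illegible passage.

```python
# Standard deviation values as printed in the paper, best mark first.
published = {"I": 1.82, "II": 0.86, "III": 0.04, "IV": -0.69, "V": -1.27, "F": -2.40}

def rescale(values, top=5.0):
    """Shift so the lowest value becomes 0, then stretch so the highest is `top`."""
    lowest = min(values.values())
    shifted = {mark: value - lowest for mark, value in values.items()}
    factor = top / max(shifted.values())
    return {mark: round(value * factor, 1) for mark, value in shifted.items()}

rescaled = rescale(published)

# Traditional whole-number points per credit hour (assumed reading of the text).
traditional = {"I": 5, "II": 4, "III": 3, "IV": 2, "V": 1, "F": 0}
largest_gap = max(abs(rescaled[mark] - traditional[mark]) for mark in traditional)

for mark in traditional:
    print(mark, rescaled[mark], traditional[mark])
print("largest difference:", round(largest_gap, 1))
```

On these figures the largest difference falls on mark V, which receives 1 point under the traditional scheme, consistent with the remark that the traditional weights penalize the poor students somewhat.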

In all our computations the grade-point ratio — average number of
points per hour — has been taken as the measure of student scholastic
achievement. [...] of estimates of the average
quality of the scholastic performance of students which can be inferred
[...] statement (12) that
the combination of quantity and quality indices results in an index
that has greater reliability than either quantity or quality indices
taken separately, but we preferred to evaluate the reliability of quality
alone. First, quantity and quality of scholastic work
are psychologically very different matters which we do not wish to
confuse. Second, with 70 per cent of our students self-supporting in
whole or in part, inconstant environmental pressures rather than
definite personal characteristics are likely to determine the amount
[...] quality of work done.
Third, students definitely attempt to fit their scholastic load to the
exigencies of economic opportunities which vary from quarter to
quarter. Again, it is the quality of work with which we are most
concerned rather than the regularity with which credits pile up in the
[...] and multiple correlation shows that average quality
is 2-3« times as important a
factor in total grade points earned as is the number of hours carried.

[...] of college marks were determined
by correlating the average quality of two essentially similar samples
of each student's work during alternate quarters of the first two years
[...] similarity in the abilities
and instruction required, the marks of each student were segregated
[...] division was compared with all work done in the same
division during the even quarters. These divisions were as follows:
 1. Architecture.
    Fine arts.
    Normal arts.
 2. Botany.
    Zoology.
 3. Business administration.
 4. Chemistry.
 5. Economics.
 6. Education.
 7. English.
 8. Geology.
 9. German.
    Greek.
    Latin.
10. History.
11. Journalism.
12. Mathematics.
    Physics.
13. Music.
14. Philosophy.
    Political science.
    Sociology.
15. Psychology.
16. Romance languages.
17. Military science — Men (required).
18. Physical education — Men (required).
19. Personal hygiene — Women (required).
    Physical education — Women (required).


A few departments were consolidated with others of perhaps questionable
similarity in order to get enough cases to give reliable determinations of
the correlations. The computation for the record of one individual is given
below to illustrate the methods used.


[Table: computation for the record of one student. Columns: (1) department;
(2)-(4) number of hours and grades, first year, by terms: fall (No. 1),
winter (No. 2), spring (No. 3); (5)-(7) number of hours and grades, second
year, by terms: fall (No. 4), winter (No. 5), spring (No. 6); (8)-(9) odd
terms: points over hours, grade-point ratio; (10)-(11) even terms: points
over hours, grade-point ratio. Rows: Education, English, History,
Journalism, Psychology, Romance languages, Physical education, and the
weighted mean for all courses. The individual entries are illegible.]

¹ Not used because can not be paired.
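
The odd-even split behind this computation can be sketched in code. What follows is a minimal illustration and not the authors' worksheet: the function name, the record layout, and the sample figures are hypothetical, and grade points per hour are assumed to be supplied directly rather than derived from the Roman-numeral marks.

```python
from collections import defaultdict


def odd_even_grade_point_ratios(records):
    """Return (odd_ratio, even_ratio): grade points earned per credit hour.

    records: iterable of (term_number, hours, points_per_hour) tuples, with
    term_number running 1..6 over the first two years (hypothetical layout).
    """
    points = defaultdict(float)
    hours = defaultdict(float)
    for term, hrs, pts_per_hr in records:
        half = "odd" if term % 2 else "even"
        points[half] += hrs * pts_per_hr  # total grade points in this half
        hours[half] += hrs                # total credit hours in this half
    return tuple(
        points[half] / hours[half] if hours[half] else None
        for half in ("odd", "even")
    )


# Hypothetical record for one student in one division.
sample = [(1, 4, 2), (2, 4, 3), (3, 4, 3), (4, 3, 2), (5, 4, 2), (6, 4, 3)]
odd_ratio, even_ratio = odd_even_grade_point_ratios(sample)
```

Correlating the odd-quarter ratio with the even-quarter ratio over all students in a division gives the three-quarter reliability coefficient discussed in the text.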


The results appear by departments and for the university as a whole in the
following table.

Reliability coefficients of University of Oregon grades

(Grades expressed in terms of grade-point ratio; based on 2-year records of
freshmen entering in 1925)

                                           rIII    rVI     rI      rI
                                           (ob-    (esti-  (esti-  (ob-
Department                                 tained) mated)  mated)  tained)   N     n      N1

Architecture, fine arts, normal art        0.69    0.82    0.62    0.46     119   2.08    51
Botany, zoology                             .80     .89     .67             123   1.93
Business administration                     .78     .88     .60     .57     101   2.3[?]  174
Chemistry                                   .71     .83     .67              86   1.86
Economics                                   .71     .83     .61             126   1.50
Education                                   .68     .81     .60             138   1.42
English                                     .77     .87     .60     .36     318   2.23    176
Geology                                     .58     .73     .46              60   1.59
German, Greek, Latin                        .90     .95     .79              76   2.3[?]
History                                     .75     .86     .63             157   1.73
Journalism                                  .69     .82     .50              48   2.24
Mathematics, physics                        .61     .76     .48              90   1.71
Music                                       .60     .75     .38              62   2.37
Philosophy, political science, sociology    .56     .71     .46             121   [?]
Psychology                                  .67     .80     .67             153   1.50
Romance languages                           .79     .88     .61     .50     261   2.48    [?]
Military science                            [?]     .74     [?]     .26     134   2.22    [?]
Physical education:
  Men                                       .67     .81     .43     .48     181   2.76    [?]
  Women                                     .61     .76     .35     .27     212   2.89    [?]
Total for men                               .85     .92     .66     .64     184   3.00    [?]
Total for women                             .90     .95     .76     .65     212   3.00    [?]
Total for all students                      .89     .94     .73     .67     396   3.00    [?]

PE (probable error):
  For departments                      .015-.058 .010-.047 .021-.060 .038-.077
  For totals                           .007-.013 .004-.008 .014-.024 .019-.029

rIII (obtained) = coefficient of correlation between (a) average grade-point
ratios for courses taken in 3 odd quarters and (b) average grade-point
ratios for courses taken in 3 even quarters.

rVI (estimated) = estimated correlation which would be obtained if
grade-point ratios for 6 odd and 6 even quarters were available.

rI (estimated) = estimated correlation which would be obtained between (a)
grade-point ratio for 1 quarter and (b) grade-point ratio for the
corresponding quarter in the following year.

rI (obtained) = actual coefficient of correlation between (a) grade-point
ratio for 1 quarter and (b) grade-point ratio for the corresponding quarter
in the following year.

N = number of cases used for obtained rIII. n = average number of quarters
paired for rIII (used for estimated rI). N1 = number of cases used for
obtained rI.


The direct comparison of the reliability of grades in different
departments, from this table, is probably not justified for two reasons.
First, the range of grades assigned is not the same for all departments,
and correlations increase with increased range. Second, it is always
possible that the apparently high reliability of grades in a given
department is spurious because instructors make a practice of exchanging
opinions about their best and poorest students and hence produce an
agreement in marking, throughout the department, based more on the
departmental reputation of the student than on his actual performance in
different courses.

We have used the Spearman-Brown “prophecy” formula to estimate the
correlation of six quarters of college work with six similar quarters,
i. e., the reliability of grade-point ratio for lower division work as a
whole. Likewise we have estimated the correlation of grade-point ratio for
a single quarter with another similar quarter, i. e., the reliability of
grade-point ratio for a single quarter. Wherever the number of cases
available warrants it, we have computed the actual correlation of the work
done in different departments and in the university as a whole during the
winter quarter of 1925-26 and the next winter quarter of 1926-27.
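
The Spearman-Brown estimates described here can be reproduced with a few lines of code. This is a sketch under stated assumptions rather than the authors' own computation: the function names are hypothetical, and the illustrative input is a three-quarter correlation of 0.89 with exactly three quarters paired.

```python
def spearman_brown(r, k):
    """Reliability of a measure k times as long as one with reliability r."""
    return k * r / (1 + (k - 1) * r)


def step_down(r_n, n):
    """Inverse use of the formula: reliability of a single unit, given the
    reliability r_n of a measure built from n such units."""
    return r_n / (n - (n - 1) * r_n)


r_three = 0.89                       # 3 odd quarters against 3 even quarters
r_six = spearman_brown(r_three, 2)   # doubled length: 6 against 6 quarters
r_one = step_down(r_three, 3)        # a single quarter against a single quarter
```

With these inputs the six-quarter estimate rounds to 0.94 and the single-quarter estimate to 0.73, the figure the text quotes for a single quarter of lower division work.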

For the university as a whole three things are rather outstanding.
(1) Women are more consistent in their scholarship than men. Three-quarter
grade-point ratios correlate rIII = 0.85 for 184 men, and rIII = 0.90 for
212 women. Since this difference is more than twice the standard deviation
of such differences, the probabilities are 98 in 100 that it is a real
difference, i. e., not due to chance errors in sampling. (2) The
reliability of cumulative estimates of average scholarship from various
departments is surprisingly high. Low correlations between test scores and
college grades are often attributed to the low reliability of the grades.
This is reasonable enough where the correlations are with grades in single
courses or even with single quarter grades. But low correlations with
averaged grades for three or more quarters of college work can not be
explained away by disparaging the essential consistency of the grades. The
relative accuracy of college marks within a given college group is quite
certainly equal to that of our best psychological tests if fairly
independent estimates of scholarship are averaged for as much as two years.
(3) Published determinations of the reliability of college grades have been
computed on such different bases that comparison is difficult. Toops (10)
got reports on the correlation of marks for successive semesters in 17
colleges. The average was r = 0.66. McPhail, Kornhauser, Cleeton, Crawford,
and Wood (1, 2, 5, 6, 13) have reported very similar findings. At Oregon we
found a correlation of r = 0.67 between grade-point ratios for two similar
quarters a year apart. We therefore venture to consider our finding typical
of college marks in general. Correlations between the marks of different
quarters (and probably of semesters also) understate a little their actual
reliability as measures. The difference is surprisingly small, rI
(estimated) = 0.73 instead of rI = 0.67, for a single quarter of lower
division work. Hence, in general, correlations between excellence of
scholarship in various terms approximate, but rather definitely understate,
the degree to which college marks succeed in identifying individual
differences in attainment.
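
The comparison of the men's and women's coefficients under point (1) can be checked roughly as follows. This sketch assumes, without confirmation from the text, that the classical large-sample standard error of a correlation, (1 - r²)/√N, was used; the function names are hypothetical. A present-day analysis would more likely use Fisher's z transformation.

```python
from math import sqrt
from statistics import NormalDist


def se_of_r(r, n):
    """Classical large-sample standard error of a correlation coefficient."""
    return (1 - r * r) / sqrt(n)


def difference_ratio(r_a, n_a, r_b, n_b):
    """Difference between two independent correlations, expressed in units
    of the standard error of that difference."""
    se_diff = sqrt(se_of_r(r_a, n_a) ** 2 + se_of_r(r_b, n_b) ** 2)
    return (r_b - r_a) / se_diff


z = difference_ratio(0.85, 184, 0.90, 212)   # men, then women
chance_real = NormalDist().cdf(z)            # one-sided normal probability
```

Under these assumptions the difference is a little more than twice its standard error, and the one-sided normal probability comes to about 0.98, in line with the "98 in 100" stated above.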
IMPROVEMENT OF THE ESSAY TYPE EXAMINATION

By R. W. Leighton ¹

In this discussion of the improvement of the essay examination the term
essay is used as an inclusive term to designate all examinations which
leave the form and the content of the answers to the student taking the
examination. This definition of the term is intended to exclude all forms
of the so-called new type or objective examination, and it is intended to
include all forms of written examinations from those which call for the
development of a topic or the writing of a composition to those which call
for a bare enumeration of facts. Likewise, these examinations may call for
the exercise of any particular mental activity the examiner may wish to
invoke.

For some time, but particularly since the advent of the new type
examination, the essay examination has been vigorously attacked as an
inaccurate measure of student achievement. Most studies of such
examinations show that this criticism is usually warranted, yet the essay
type is still the one most often used in an institution of higher learning
offering such courses as those which are offered here at the university,
probably because of its extreme flexibility in adaptation to the
measurement of the ability of a student to exercise different mental
activities in the various subject fields. Whatever the reason for its
popularity may be, the fact that it is the measuring device most often used
warrants a careful study of its weaknesses; hence the purpose of this paper
is to offer the result of some study of two of these weaknesses.

There are only two weaknesses of the essay examination which materially
lower its value as a measuring device. These are: first, its subjectivity,
or the inability of different judges rating papers written for such
examinations to agree as to the value of these papers as evidence of
successful achievement; and second, its low capacity for sampling the
student’s knowledge in a given subject field. In other words, the number of
questions is so limited that students do not have equal opportunity of
showing their achievement. Certainly, either or both of these weaknesses
make the essay examination a very poor measure when they are present to any
great degree.

The new type or objective examinations owe most of their success to the
extent to which they eliminate these factors and any improvement of the
essay must also begin with their elimination.

¹ University of Oregon.

During the last three years essay examinations have been studied here at
the university whenever it was possible to get two or more judges to rate
the papers. These judgments were then correlated in order to determine a
mathematical coefficient which could be used to represent the reliability
of the judgments or ratings. The first 20 examinations used were drawn from
four different departments which used essay tests and the judgments showed
correlations ranging from r = 0.36 to r = 0.65. Four of these r’s were
below 0.5 and the other 16 ranged from 0.5 to 0.65.
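
Correlating the ratings of two judges, as described above, amounts to computing a product-moment correlation over the papers they both rated. The sketch below is illustrative only: the function name and the ratings are hypothetical, and the use of the Pearson coefficient is an assumption, since the text does not name the formula.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical ratings of the same eight papers by two judges.
judge_a = [1, 2, 2, 3, 3, 4, 4, 5]
judge_b = [2, 1, 3, 3, 2, 4, 5, 4]
agreement = pearson_r(judge_a, judge_b)
```

A coefficient near 1 means the judges ordered and spaced the papers almost identically; values in the range reported here (0.36 to 0.65) indicate substantial disagreement.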

Study of these examinations revealed the fact that a large number of the
variations in the scores assigned by the different judges were due to two
things, namely, lack of agreement as to what the question asked for, and
lack of agreement as to what the answer should be. In the one case this was
due to poorly worded questions and in the other to incompetence of the
judges. Graduate students had acted as judges in most cases and their
judgments did not agree with the judgment of the instructor, usually
because they were prone to assume that one answer and one only could be
correct, even for questions which involved controversial points. There
seemed to be no reason to suppose that such variations could not be
eliminated, nor did there seem to be any reason for classing them as
factors of subjectivity.

The next step undertaken was to make similar comparisons in normal
situations in which these factors had been minimized as much as possible,
and by this means attempt to determine an index of the lowest point to
which subjectivity alone need reduce the reliability of judgment, also to
determine an index which would represent the greatest possibilities for the
reliability of scoring essay examinations. These indices, if they could be
established, would give a rather definite picture of the operation of the
subjectivity factor. Accordingly a test in philosophy was chosen as the
most subjective test to be found among all term tests, and a test in plant
biology was chosen as the least subjective essay test it was possible to
find.

The philosophy test was one given as a final term test. It required two
hours to finish, but no time limit was set, and it contained 10 questions
which were to be answered separately. There were 32 papers to be rated. The
test called for no factual material for purposes of evaluation; instead the
judges looked for such things as revelation of the extent of a student’s
reading with regard to his ability to choose his readings wisely, and his
ability to grasp philosophical problems. These and other equally subjective
factors made up the criteria for rating the papers.

No single questions were scored or weighted on any paper, but each paper
was scored as a unit, as worth a 1, a 2, a 3, a 4, a 5, or a failure. The
judges were the instructor handling the class and an instructor from
another department who studied extensively in the field of philosophy.
[...] ambiguity in the questions, and the instructor handling the class
very carefully explained to the second judge just what the criterion of
judgment should be. Both held as closely to this criterion as they could
while rating the papers.

The following conditions, then, existed for this comparison of judgments or
scoring:

The examination was designed to measure what are commonly called extremely
subjective factors.

[...]

The students were known to one instructor and were not known to the other,
so that any so-called “halo” effect which is present in a normal situation
was present and operative in this case.

[...] range of grading was narrow.

[...] rated the papers, but the instructor handling the class was
necessarily the more competent of the two.

The correlation between the resulting grades was r = 0.63, which is [...]
that this situation was [...] of judgment [...], and it may be assumed to
be the lowest point to which [...] of judgment in the case of these
examinations.

The second examination used was an examination in plant biology. This
examination consisted of five questions designed to measure specific
knowledge acquired in the laboratory part of the [...] papers [...] rated
first by five laboratory assistants, each of whom rated one question and
one only. The examination was prepared by the instructor in charge of [...]
instructions as to how each question was to be rated were carefully stated
by the instructor. Each judge was [...] This was the usual procedure [...]
because it was the least deviation from [...] procedure and because it was
an excellent method of obtaining [...] reliability of these judgments the
papers were given to a graduate student who was not connected with the
course and who knew [...] of the students from the standpoint of their
classroom work. This second judge rated all the questions on all the papers
but followed the same instructions given to the first judges, and also
rated the questions one at a time for all papers; that is, she rated all
answers to question 1 before rating any answers to question 2.

[...] comparison of judgments:

[...] designed to measure what are commonly [...]

The criteria for judgment were definitely explained to the judges.

The students were known to the judges in one case and not in the other.

There was a large number of students involved and a wide range of scoring.

The correlation found was r = 0.90, which was the highest correlation of
any kind found in the study of both essay and objective examinations used
as final term measures. It may be assumed to be an index of how high the
reliability of judgment may go in the case of the examinations studied.

These two examinations represent the greatest range for the effect of
subjectivity upon the reliability of judgment that could be found among the
term tests of the essay type. And the total range lies between the
correlation coefficients 0.63 and 0.90.

The results described so far offer rather definite evidence of the
following: (1) That the range of the effect of subjectivity in scoring
essay tests is not so great as is often supposed. (2) That the reliability
of judgment increased to a marked degree under the following conditions:
when the factors to be measured are carefully determined before writing the
examination; when the questions used are carefully planned to meet the
requirements of the examination and of clear statement; and when the
criteria for judgment are carefully set up.

The extra work imposed upon the instructors so far as actual assigning of
grades was concerned was not considered by them to be significant.

Another illustration will show rather well the influence of these factors
in improving the reliability of judgment in scoring this type of material.
It happened that 115 compositions had been written by entering freshmen. In
this case the choice of topics for these compositions was not well
controlled and the judges were simply asked to rate them in an order of
merit as evidence of the ability of the student to undertake college
English. The judges were instructors handling the English courses involved.
In this case, therefore, there was some ambiguity in the assignment and no
criteria of judgment were set up; instead each instructor rated according
to his own ideas. The resulting correlation was low, as such correlations
may be expected to be when careless procedure is permitted: the r was 0.36.
It was desirable to use such compositions as the criterion in another
study, so one of the instructors worked out a scale for their measurement.
One hundred and forty-two people were then asked to write on a given topic
which was carefully chosen and carefully assigned, in order that each
student might have nearly equal opportunity. The compositions were then
carefully rated on the basis of the new scale by two high-school teachers
of English, one of 10 years’ experience and one of 5 years’ experience. The
correlation of the resulting judgments was 0.88.

In this case we have again a great increase in agreement resulting when the
directions for writing the compositions were accurately planned and when a
definite criterion for judgment was used by the judges. There was no reason
to believe that there were any great differences in the abilities of the
four judges involved.

It will probably be pointed out that Starch and Elliott, Ruch, and others
found much different results than those just quoted, and it would be well
to keep in mind certain things in connection with those studies. First,
they experimented with one paper and a group of judges, while these studies
used many papers and a few judges; also their conclusions are based upon
the numerical ratings on the one paper used. Such ratings may have some
value, but the value of the ratings would have been much more significant
if several papers had been graded and then the ratings compared to see how
well the teachers had agreed as to which papers were best, next best, etc.,
for normally grades are determined by comparative value rather than by
percentage values. Probably a far different story would have been told had
this been done.

Second, there is no evidence that a very definite criterion for judgment
was offered. These same experimenters were later very careful of that
factor when building objective tests. Their keys to objective tests are
carefully worked out and rigidly adhered to in scoring such tests.

The findings offered here appear when the judgments are directed and
controlled in the same way when the essay test is involved that they are
directed and controlled when the new type test is involved. In fact the
procedure was deliberately copied from the procedure used with the new type
tests. In every case the instructor in charge of the work decided what was
to be measured, wrote questions which in his judgment would measure it,
and, finally, definitely outlined a key or criterion by which the results
were to be judged. The instructor’s judgment is always final in this way
for ordinary term tests, no matter which type test is used.

The second weakness of the essay test which was mentioned earlier (the
matter of inadequate sampling of a student’s knowledge in a given field)
does not apply to all essay tests. For example, it does not apply in the
case of the second group of compositions described, neither does it apply
to any extent to the philosophy examination, for the students were given
opportunity in both these cases to express themselves concerning subjects
with which they had to be familiar if they were to be considered prepared
at all.

When the sampling factor does apply (as it did in the biology test quoted)
it is difficult to obtain a coefficient which represents its actual effect
upon results. We may assume that the correlation of chance halves of the
test (as represented by odd v. even numbered questions) can not be high if
great errors of sampling are present. But this correlation is complicated
by the reliability of the judgment in rating the test and also by nearly
all other weaknesses that may be present. However, if a situation can be
found in which these other weaknesses are reduced to a minimum, the
correlation of chance halves of the test should be affected chiefly by
sampling. On this assumption chance halves of the biology test mentioned
earlier were correlated and the resulting r, when stepped up by use of the
Spearman-Brown formula, was 0.87, so it would seem safe to assume that
adequate sampling is possible when care is exercised in handling the
examination. A large number of other essay tests were treated in the same
manner and gave r’s ranging from 0.09 upward, with the median somewhere in
the 0.50’s. However, seven carefully handled examinations were available
which gave r’s ranging from 0.67 to the one quoted before, namely, 0.87. As
pointed out, the error of sampling is exaggerated when this method of
measuring that factor is used, yet it is about as good evidence as can be
obtained. Still further study has shown that these correlations of chance
halves of essay tests increase materially when a large number of questions,
requiring only short answers, is used in these tests, a fact which need
only be mentioned to be self-evident.
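
The chance-halves procedure described above (odd against even numbered questions, stepped up by the Spearman-Brown formula) can be sketched as follows. The function names and the marks are hypothetical, and the layout of one list of per-question marks per paper is an assumption made for the illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_half_reliability(papers):
    """Return (r_half, r_full) for a list of per-question mark lists."""
    odd = [sum(marks[0::2]) for marks in papers]    # questions 1, 3, 5, ...
    even = [sum(marks[1::2]) for marks in papers]   # questions 2, 4, 6, ...
    r_half = pearson_r(odd, even)
    r_full = 2 * r_half / (1 + r_half)              # Spearman-Brown step-up
    return r_half, r_full


# Hypothetical marks on a four-question test for five students.
papers = [
    [5, 4, 5, 4],
    [3, 3, 2, 3],
    [4, 2, 3, 4],
    [1, 2, 2, 1],
    [2, 3, 1, 1],
]
r_half, r_full = split_half_reliability(papers)
```

Because each half contains only half the questions, the raw half-test correlation understates the reliability of the whole test; the step-up corrects for that shortening.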

This study has not been an attempt to treat comprehensively all of the
factors which limit the value of the essay examination, but rather an
attempt to deal with two of its principal weaknesses. Its other weaknesses
are weaknesses of most tests, whatever the type, and should be handled as
weaknesses of examinations in general.

SUMMARY

The studies described give evidence of the following:

1. Much of the inaccuracy of rating essay examinations heretofore credited
to subjectivity has been due to carelessness in writing questions and in
setting up criteria by which answers to these questions were to be judged.

2. The effect of subjectivity upon scoring is not so great as is commonly
supposed and it need not reduce the reliability of judgment below a
reasonable limit. In some cases it becomes of negligible importance.

3. The essay examination yields readily to the techniques of improvement
which are used to improve new type examinations, and experience with the
studies described in this paper indicates that greater results are obtained
for a small amount of work in the case of the essay examination than for
the same amount of work in the case of the new type.

4. The sampling error is not present in many types of essay examinations
and when it is present its effect is reduced to reasonable limits by care
in selecting questions or by using a large number of short-answer
questions.


GROUP II. STUDENT PERSONNEL STUDIES


AN EVALUATION OF CERTAIN TESTS AND INFORMATION
FOR PREDICTING SUCCESS IN NORMAL SCHOOL

C. C. Upshall ¹ and Harry V. Masters ²

The problem of this paper is to evaluate certain tests and information for
the purpose of predicting the following: (1) Average first quarter grades;
(2) average grades during the whole undergraduate period; (3) practice
teaching success; (4) whether or not the student will graduate; (5) whether
or not the student will receive a position through the Appointment Bureau
of the State Normal School at Bellingham; (6) success in teaching in the
field during the first semester after graduation.

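The Bellingham State Normal School gives eight tests to all entering
students. These tests with working time limits are:

                                                               Minutes
1. Thorndike examination for high-school graduates ........... 60
2. History ................................................... 25
3. English usage ............................................. 20
4. Arithmetic reasoning ...................................... 12
5. Arithmetic computation .................................... [?]2
6. Geography ................................................. 26
7. Spelling .................................................. 50 words
8. Penmanship ................................................ Two 4-minute tests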


The Thorndike examination for high-school graduates is used in the grading
system of the school. The tests in English usage, arithmetic reasoning,
arithmetic computation, and spelling are used to measure the student’s
ability in these fields. A minimum score (this score is a point minus
one-half sigma below the mean of the entering group) must be attained
before a student is permitted to do practice teaching. At the present time
only three retests are allowed in any one subject. If, after the third
retest, the student has not attained the minimum requirement he is advised
that he can not graduate from the Bellingham State Normal School. The
history test is not used directly at present. It is still in an
experimental stage. The geography test has been dropped because it seemed
to duplicate certain of the other tests. If a student does not attain the
minimum score in penmanship he must take a course in this subject before he
is allowed to do his practice teaching.

¹ C. C. Upshall, director of the Bureau of Research, State Normal School,
Bellingham, Wash. B.A., University of British Columbia, 1923; Ph.D.,
Columbia University, 1929. He was formerly instructor in the International
Institute of Teachers College, and statistician for the New York Commission
on Ventilation.

² Harry V. Masters, associate director of the Bureau of Research, State
Normal School, Bellingham, Wash. B.A., Western Union College, 1924; M.A.,
University of Iowa, 192[?]; Ph.D., 1927. Publication: “A Study of Spelling
Errors,” University of Iowa Studies in Education.

The reliability of these tests is quite high. Table 1 gives the reliability
coefficients for each of the tests with the exception of spelling. The
reliability of all the tests with the exception of the Thorndike
examination was computed by means of the split-halves technique and
corrected by means of the Spearman-Brown prophecy formula. The reliability
given for the Thorndike examination is taken from Wood (1).³


Table 1. Reliability coefficients of the tests given to students entering
the State Normal School at Bellingham

                            Reliability    Probable    Number
Test                        coefficient    error       of cases

Thorndike examination          0.85
History                        [?]           0.01        197
English usage                   .92           .00        43[?]
Arithmetic reasoning            .72           .03        150
Arithmetic computation          .82           .02        150
Geography                       .80           .01        405

The majority of the grades which are given in the Bellingham State 
Normal School are based almost entirely on objective tests which are 
made by the teachers and scored under the direction of the Bureau 
of Research. The reliability of the composite scores based upon 
these objective tests has been computed for several courses. The 
reliability of these composite scores ranges from 0.24 to 0.96. The 
median reliability is 0.85. Each instructor, of course, has the liberty 
of making changes, based on additional information, in the distribu- 
tion derived from the objective tests. In some subjects very few 
objective tests are used. It follows then that the reliability of the 
grade in the typical course will probably be something less than 
0.85. On the other hand, average first quarter grades will have a 
reliability that is superior to the grades given in a single course. 

The first problem in this study was the selection of a group from which a
regression equation could be derived to predict average first quarter
grades. A hundred students, selected at random from the freshmen who
entered in the fall of 1927, were chosen. All the tests, with the exception
of penmanship, were used in an endeavor to predict the average first
quarter grades of this group of 100. These tests were the Thorndike
examination, history, English usage, spelling, arithmetic reasoning,
arithmetic computation, and geography. In addition the age of the students
at high-school graduation was used. Table 2 gives all the intercorrelations
for these tests with average first quarter grades.

³ Numbers in parentheses refer to “Bibliography,” p. [?].


Table 2. Intercorrelations for the 1927 group

Variable                              2      3      4      5      6      7      8      9

1. Average first quarter grade     -0.43   0.68   0.57   [?]    0.44   0.54   [?]    0.54
2. Age at high-school graduation           -.30   -.14   -.19   -.44   -.20   [?]    -.28
3. Thorndike score                                [?]    [?]     .36    .59   [?]     .68
4. History                                               [?]     .31    .34   [?]     .77
5. English usage                                                 .34    .21   [?]     .20
6. Spelling                                                             [?]   [?]     .38
7. Arithmetic reasoning                                                       [?]     .51
8. Arithmetic computation                                                             .36
9. Geography

It was found that of these factors only the Thorndike examination, history
test, and age of high-school graduation were significant in predicting
average first quarter grades. The others added practically nothing to the
prediction. The regression equation, using these three predictive factors,
is: average first quarter grade = -0.014 times age of high-school
graduation in months + 0.036 times the score on the Thorndike examination +
0.012 times the score on the history test + a constant of 0.26. Table 3
shows the means, sigmas, and the regression equation weights for the four
variables used in the above equation.
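
The regression equation just stated can be written out as a small function. The coefficients are those given in the text; the function name and the sample scores are hypothetical and serve only to show how a prediction is formed.

```python
def predicted_first_quarter_grade(age_months, thorndike, history):
    """Predicted average first quarter grade from the equation in the text.

    age_months: age at high-school graduation, in months
    thorndike:  score on the Thorndike examination
    history:    score on the history test
    """
    return -0.014 * age_months + 0.036 * thorndike + 0.012 * history + 0.26


# Hypothetical student: 18 years (216 months) old at graduation.
prediction = predicted_first_quarter_grade(216, 70, 100)

# The negative age weight means that, at equal test scores, a student who
# graduated younger is predicted to earn a higher grade.
younger = predicted_first_quarter_grade(204, 70, 100)
```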


Table 3. Means, sigmas, and regression equation weights of the variables
used in the prediction of average first quarter grades (1927 group)

Variable                                 Mean     Sigma    Weight

Average first quarter grades             [?]       0.8
Thorndike examination                    [?]      10.8     0.038
Age of graduation from high school       [?]      10.8      .014
History                                  [?]      18.3      .012

The coefficient of multiple correlation from which the regression equation
is derived is 0.762. The standard error of estimate is 0.5 of a grade
point.

In order to find the value of this equation in predicting average first
quarter grades for a new group, all the students who entered as freshmen in
the fall of 1928 were chosen for investigation. The average first quarter
grade of each student was predicted and the correlation between actual
average first quarter grades and predicted first quarter grades was
computed. This coefficient of correlation is 0.71±0.02. It is seen that
there is a decrease of 0.05 when the regression equation based on the
scores of the 1927 group is applied to the 1928 group. This coefficient of
correlation may be interpreted as being 30 per cent better than chance.




There were 275 students who entered as freshmen during the fall
quarter of 1928. Of these, 125 were graduated at the end of the
spring quarter 1930. The coefficient of correlation between predicted
average first quarter grades and actual average first quarter
grades of these 125 students is 0.61 ± 0.04. For this selected group
there is a decrease of 0.10 point in the correlation from that found
when the 275 students were used. However, as is shown in Table
4, the standard deviation of the 125 group is only 0.47 whereas the
standard deviation of the 275 group is 0.74.


Table 4. Coefficients of correlation between predicted grades and actual grades

Predicted score          Average first         Sigma of
                         quarter grades        predicted scores

1927 group               0.76 ± 0.03
1928 all students         .71 ±  .02           0.74
1930 graduates            .61 ±  .04            .47
1930 graduates (1)        .63 ±  .04            .47

(1) Average grade during six quarters.


The predicted average first quarter grades for the group of 125
were correlated with the actual average grades received during the
entire period of attendance. This coefficient of correlation is 0.63 ± 0.04.
The standard deviation is 0.47.

In order to determine the relationship between average first quarter
grades and the ability of the student to graduate, the biserial r
technique was used. Table 5 gives this relationship. Table 5 also gives
the biserial r between graduating and predicted first quarter grades
and between graduating and the Thorndike score.

Table 5. Biserial coefficients of correlation between graduating and average first
quarter grades, predicted first quarter grades, and Thorndike scores

                                        Graduating or not

Average first quarter grades            0.61 ± 0.04
Predicted first quarter grades           .37 ±  .06
Thorndike score                          .30 ±  .06
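
The paper does not state the computing formula behind its biserial coefficients. The following is a minimal sketch, assuming the classical biserial formula r = ((M1 - M0) / s) * (p * q / y), where M1 and M0 are the mean scores of the two groups, s is the standard deviation of all scores, p and q are the proportions in the two groups, and y is the ordinate of the unit normal curve at the point dividing p from q. The function name and the sample data are hypothetical.

```python
from statistics import NormalDist, fmean, pstdev


def biserial_r(scores, graduated):
    """Classical biserial correlation between a score and a yes/no outcome.

    scores    -- numeric scores, one per student
    graduated -- booleans of the same length (True = graduated)
    """
    yes = [x for x, flag in zip(scores, graduated) if flag]
    no = [x for x, flag in zip(scores, graduated) if not flag]
    if not yes or not no:
        raise ValueError("both outcome groups must be non-empty")
    p = len(yes) / len(scores)          # proportion graduating
    q = 1.0 - p                         # proportion not graduating
    unit_normal = NormalDist()
    # Ordinate of the unit normal curve at the point of dichotomy.
    y = unit_normal.pdf(unit_normal.inv_cdf(p))
    s = pstdev(scores)                  # standard deviation of all scores
    return (fmean(yes) - fmean(no)) / s * (p * q / y)


# Made-up data: six students, the three highest scorers graduate.
r = biserial_r([1, 2, 3, 4, 5, 6], [False, False, False, True, True, True])
```

Unlike an ordinary correlation, the biserial coefficient can exceed 1 when the two groups do not overlap at all, as in the made-up data here, so values from very small or sharply separated groups should be read with caution.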


Average first quarter grades are distinctly better than either the
Thorndike score by itself or the predicted first quarter grades based
on the Thorndike score, history score, and age of graduation from high
school in estimating whether or not a student will graduate from the
Bellingham State Normal School. However, none of the coefficients
is sufficiently high for use in predicting an individual case. The
predicted first quarter grade and the Thorndike score are practically
useless even for the prediction of group achievement.

The tests have proved fairly satisfactory for predicting average
first quarter grades. The average grades for the entire course of
those who graduate are reasonably well predicted. The next question
to be discussed is the value of the tests for predicting the rating
students will receive in their practice teaching. Table 6 shows the
coefficients of correlation between practice teaching grades and each
of the tests that were used for predicting average first quarter grades.
In addition the coefficients of correlation between practice teaching
and (1) teaching success in the field during the first semester after
graduation, (2) average first quarter grades, (3) estimated first quarter
grades, and (4) all the tests combined are shown.


Table 6. Zero order coefficients of correlation existing between practice teaching and
certain other variables and between teaching success and certain other variables


1 

• Pntctioe 

teaching 

Teaching 
1 success 

ct 

1 ■ 

1 

1 

1 

1 

! ® 

Practice teaching. . 

a27^M 

-.6^ ,M 
.02:i; .OS 
.03± .M 
.06^ .M 

Average hm quarter grades * 

Predicted first quarter grades » 

All testa combined • Ofi 

Age of hlgh-echool graduation ^ 

A rltnmetjo reasoning .uo 

Arithmetic oomDUtation ^ 

Engliah usage ----- - — .. • 23±: . Ofi 

History — , -13^ .00 

— • 18:1: .08 
.14:^. 08 

.08zfc .06 

-.04± .00 
-.04± .08 
.Oe^ .08 


The highest coefficient of correlation is between practice teaching
and average first quarter grades. This coefficient is 0.45 ± 0.05.
The coefficient of correlation between practice teaching and
predicted first quarter grades is 0.36 ± 0.06. When all the tests are
combined into one score and this score is correlated with practice teaching
the correlation is 0.44 ± 0.05. It seems that none of the tests adds
anything to the prediction of practice teaching after average first
quarter grade has been used. The coefficient of alienation for a
correlation of 0.45 is 0.89. This means that practice teaching may be
predicted from average first quarter grades with an accuracy only 11 per
cent better than chance. Our tests do not predict practice teaching grades
nearly as well as they predict average first quarter grades.
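
The "per cent better than chance" figures quoted in this paper agree with the usual definition of the coefficient of alienation, k = sqrt(1 - r^2), with the gain over chance taken as 1 - k. The sketch below assumes that definition; the function names are hypothetical.

```python
import math


def coefficient_of_alienation(r):
    """k = sqrt(1 - r^2): the share of the error of estimate that remains."""
    return math.sqrt(1.0 - r * r)


def per_cent_better_than_chance(r):
    """Reduction in the error of estimate, in per cent, from using r."""
    return 100.0 * (1.0 - coefficient_of_alienation(r))


# Correlations reported in the text.
for r in (0.71, 0.45, 0.27):
    print(r, round(coefficient_of_alienation(r), 2),
          round(per_cent_better_than_chance(r)))
```

Under this definition a correlation must reach about 0.87 before the error of estimate is cut in half, which is why coefficients of 0.27 to 0.45 give so little help in predicting an individual case.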

Perhaps this lower correlation may be due in part to a low reliability
in the practice teaching ratings. While the reliability of practice
ratings can not be determined with complete accuracy, it is possible
to estimate, within certain limits, what this reliability is. The
students who do their practice teaching are given a letter grade rating
by their home-room teachers. They are also given a letter grade
rating by a supervisor. These ratings are supposed to be made
entirely independently. The teacher is obliged to rate each student
on a rating scale which has been evolved by this institution. The
supervisors also rate the student independently on this same rating
scale. The coefficients of correlation between the ratings given by
the teachers on this rating scale and ratings given by the supervisor
will give an indication of the reliability of the practice teaching grade.
Another measure of reliability of the practice teaching grade is the
correlation between the practice teaching grade given by the teacher
and that given by the supervisor. Two other indications of
reliability may be computed, i. e., the coefficient of correlation between
the teacher's rating and the teacher's grade and the coefficient of
correlation between the supervisor's rating and the supervisor's
grade. Table 7 summarizes these coefficients of correlation.


Table 7. Coefficients of correlation indicating the reliability of practice teaching
grades and ratings


FftClors 

1 

2 

3 

4 

, 

Mt iri0 

o.n±o 01 

0 . 80±a03 


vIatw' ■ vtaHa ... 

a 01 



d 88:^. 02 

ratlnff 

.SOdb .03 

.02 


.VM>^ .02 




.02 




1 



The lowest coefficient of correlation is that between the teacher's
rating and the supervisor's rating. This is 0.80 ± 0.03. The highest
coefficient of correlation is between the supervisor's rating and the
supervisor's grade. It is 0.92 ± 0.01. Since the final teaching grade
is the average of the teacher's grade and the supervisor's grade, it is
to be expected that the coefficient of reliability will be slightly higher
than those reported here. The reliability of the practice teaching
grade is as high as the reliability of the objective tests. Naturally a
little, if not a large amount of this agreement, is due to similar
standards of judgment and similar educational philosophies. This
agreement would not be expected between superintendents and principals
in the field.

The maximum correlation that could be expected between the
practice teaching grade and the tests in the light of the reliability of the
tests and the reliability of the teacher ratings of practice teaching
would be in the neighborhood of 0.90. The correlation of 0.45
between average first quarter grades and practice teaching is far from
being as high as the reliability of the tests would allow.
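
The paper does not show how the ceiling of about 0.90 was reached. One standard way to obtain such a ceiling, assumed here and not confirmed by the text, is the attenuation limit of classical test theory: two measures cannot correlate more highly than the square root of the product of their reliabilities. The reliabilities of 0.90 used below are illustrative values, not figures taken from the study.

```python
import math


def maximum_expected_correlation(reliability_x, reliability_y):
    """Upper limit on the correlation between two imperfectly reliable measures."""
    if not (0.0 <= reliability_x <= 1.0 and 0.0 <= reliability_y <= 1.0):
        raise ValueError("reliabilities must lie between 0 and 1")
    return math.sqrt(reliability_x * reliability_y)


# Illustrative assumption: tests and practice teaching grade both near 0.90.
ceiling = maximum_expected_correlation(0.90, 0.90)
```

With both reliabilities near 0.90 the ceiling is itself about 0.90, so unreliability alone cannot account for an observed correlation as low as 0.45.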

The rating scale which was used by the teachers and supervisors for
indicating the quality of practice teaching was sent to the principals
and superintendents who employed the graduates. Ratings by the
superintendents and principals were returned for approximately 80
per cent of the students who received positions through the
appointment bureau since June, 1930. The coefficients of correlation in
Table 6 between teaching success and the other variables are based
on 108 for whom these field ratings were secured. It is seen
that the most valuable single indication of success in the
field is the rating given for practice teaching. This coefficient of
correlation, however, is only 0.27 ± 0.06. It is barely significant, being
only four and one-half times its probable error. In terms of accuracy
of prediction this is only 4 per cent better than chance. None of the
other factors that were used to predict teaching success does so
reliably. All the coefficients of correlation are within three times the
probable error of the coefficients. It would appear then that, so far
as predicting teaching success is concerned, the tests and information
used in this study are worthless with the possible exception of the
practice teaching rating.
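
The "±" figures attached to the coefficients in this paper are probable errors. The paper does not state its formula; the sketch below assumes the conventional one, P.E. = 0.6745 (1 - r^2) / sqrt(N), which reproduces the reported values for the three coefficients whose N is given. The function name is hypothetical.

```python
import math


def probable_error_of_r(r, n):
    """Conventional probable error of a correlation coefficient (assumed formula)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# Coefficients and group sizes reported in the text.
pe_field = probable_error_of_r(0.27, 108)   # practice teaching vs. field rating
pe_all = probable_error_of_r(0.71, 275)     # all 1928 entrants
pe_grads = probable_error_of_r(0.61, 125)   # 1930 graduates

# Ratio of the coefficient to its rounded probable error.
ratio = 0.27 / round(pe_field, 2)
```

The paper treats a coefficient as significant only when it is several times its probable error; a ratio of four and one-half is described as barely significant, and coefficients within three times their probable error are dismissed.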

The reliability of the ratings given by the superintendents
and principals should be secured in order to estimate the
amount of correlation that might be expected between teaching
success and the various other factors. An indication of this reliability
is being secured at the present time by means of another rating which
will be returned during May of this year. Until this measure of
reliability is secured it is impossible to tell how high a correlation might be
expected between practice teaching and success in the field.

It is interesting to compare the results achieved in the Bellingham
State Normal School with the results achieved by other institutions.
Ullman, in the Journal of Educational Administration and Supervision,
reports an investigation of 116 graduates who were teaching in junior
or senior high schools. Table 8 gives the zero order coefficients of
correlation reported by Ullman for certain factors which are comparable
with those which were obtained in this study.

Table 8. Certain zero order coefficients of correlation reported by Ullman


Factors 1 | , 

2 

1 ’ 

i 

5 

0 

Precttoe twcblfu I 

TiAchlof Aicoess * 

a» 

a24 
. 15 

a2o 

,80 

s80 

. . . p 

a23 

.20 

.20 

.61 

.60 

Brown psychologiiil exarnLutioD ■ * 

Academic marks * , • 

1 

15 ! 
:30 

w80 

.30 1 

rToraaiional marks * — — • — — 

Major subject marks 1 • ^ 

. 30 
,32 
.29 

.55 

,81 

1 



Ullman reports a coefficient of correlation of 0.36 between practice
teaching and teaching success, a coefficient of 0.15 between the Brown
Psychological Examination and success in the field, and a coefficient
of correlation of 0.24 between the Brown Psychological Examination

r46 ^ ^be extent of 

0.46 with professional marks whereas it correlates only 0.26 and 0.22
with academic marks and major subject marks, respectively. Ullman
reports a coefficient of correlation of 0.30 between teaching success in
the field and professional marks. This coefficient of correlation is
considerably higher than the one obtained in this study, i. e., 0.02.
The difference may be accounted for, in part, by the fact that in rating
junior and senior high school teachers more emphasis is placed on
comprehension of subject matter than in rating elementary school
teachers. In general, however, it may be said that the results reported
in this paper agree with the results reported by Ullman. At present
no single test or rating seems to predict teaching success at all reliably.

Table 9. Biserial coefficients of correlation between receiving a position after
graduation and certain other variables

                                                                     Position

First quarter grades                                                 0.48 ± 0.05
Predicted first quarter grades                                        .23 ±  .06
Supervisor's rating                                                   .35 ±  .05
Thorndike                                                             .26 ±  .06
Mean of first quarter grades and predicted first quarter grades       .45 ±  .06


Table 9 shows the biserial r between receiving a position through
the appointment bureau and (1) first quarter grades, (2) predicted
first quarter grades, (3) supervisor's rating, (4) Thorndike
examination, and (5) the mean of first quarter grades and predicted first
quarter grades. The highest coefficient of correlation is again with
first quarter grades, i. e., 0.48. Predicted first quarter grades
indicate very poorly indeed whether or not a student will receive a
position. The Thorndike examination is almost as unreliable. The
mean of average first quarter grades and predicted first quarter grades
does not increase the coefficient of correlation that is found when
first quarter grades alone are used. The supervisor's rating
correlates 0.35 with receiving a position through the appointment bureau.

It will be seen that none of the coefficients of correlation reported
in this study is of sufficient predictive value to estimate reliably the
score that would be obtained by an individual student. In guiding
individual students who attend the normal school, it is only possible
to indicate their chances of success. It is seldom possible to be
certain that a student will succeed or fail although the average first
quarter grade of most students can be predicted fairly reliably.
Whether or not a student will graduate from the Bellingham State
Normal School can be estimated less reliably but still sufficiently well
to be useful. Whether or not a student will receive a position through
the appointment bureau after graduating can be predicted with
some degree of accuracy. With the present reliability and validity
of our tests and ratings it is impossible to give the entering student
any indication at all of what his rating as a teacher in the field will be.

In order to be able to give information to the students which will
be more meaningful to them than the coefficients of correlation
reported in this paper, the information summarized in Table 10 has
been prepared.

Table 10. Per cent of those entering the Bellingham State Normal School that
graduate, the per cent of those entering that receive positions, and the per cent of
those who graduate that receive positions

Column headings: Item; Range in grades; Number entering in 1928; Per cent
graduating in 1930; Per cent receiving positions in 1930; Per cent of those
graduating that received positions.

Items and ranges:

Actual first quarter grade: 3.0 to 3.9, 2.0 to 2.9, 1.0 to 1.9, and 0.0 to 0.9 inclusive.
Predicted first quarter grade: 3.0 to 3.9, 2.0 to 2.9, 1.0 to 1.9, and 0.0 to 0.9 inclusive.
Mean of actual first quarter grade and predicted first quarter grade: 3.0 to 3.9,
2.0 to 2.9, 1.0 to 1.9, and 0.0 to 0.9 inclusive.
Thorndike examination: A, B, C, D, and E ratings.
Supervisor's rating: 3.5 to 3.9, 3.0 to 3.4, 2.5 to 2.9, 2.0 to 2.4, and 1.5 to 1.9 inclusive.

Table 10 shows the distributions of (1) average first quarter grades,
(2) predicted first quarter grades, (3) the mean of average first
quarter grades and predicted first quarter grades, and (4) Thorndike
scores for the 275 students who entered as freshmen in the fall of
1928. The distribution of the supervisor's ratings of 218 students
who graduated during the school year 1929-30 is also given. It
gives for each level of achievement, for each of the variables, the
per cent of those entering in 1928 who graduated in 1930. In column
5 it gives the per cent of those entering in 1928 who received positions
through the appointment bureau, and in column 6 it gives the per
cent of those who actually graduated who received positions. As
would be expected from the coefficient of correlation between actual
first quarter grades and ability to graduate, those who receive high
actual first quarter grades are more likely to graduate than those
who receive low average first quarter grades. Sixty-five per cent of
the students who received average first quarter grades between 3
and 4 were graduated, 54 per cent received positions, and 82 per cent
of those that graduated received positions. At the other end of the
distribution it is seen that none of those who received average first
quarter grades between 0 and 1 was able to graduate. Only 25 per
cent of those that received average first quarter grades between 1
and 2 were able to graduate and only 14 per cent received positions.
Those students, then, that received average actual first quarter
grades below 2, have a relatively small chance of graduating, actually
less chance than one chance in four. They have only one chance in
seven of receiving a position. However, if these students graduate
they have as good an opportunity to receive a position as those that
receive average first quarter grades between 2 and 3.

Sixty-five per cent of those who received predicted first quarter
grades between 3 and 4 were graduated, but only 46 per cent of this
group received positions. Seventy per cent of those that were
graduated received positions. None of those students who received
predicted first quarter grades between 0 and 1 was graduated. Only
one in four of the students who received predicted first quarter grades
between 1 and 2 was graduated and only one in seven received
positions. Forty-eight per cent of those who received A ratings on the
Thorndike examination were graduated and 32 per cent received
positions. (A rating of A is equivalent to a score which is between
one and one-half and two and one-half sigma above the mean of the
entering group. A rating of E is equivalent to a score which is
between one and one-half and two and one-half sigma below the
mean.) Sixty-five per cent of those who received B ratings were
graduated and 44 per cent received positions. Twenty-three per cent
of those that received an E rating were graduated and 15 per cent
received positions.

Of those students who received a rating between 1.5 and 1.9 from
the supervisors none received positions. Forty-eight per cent of
those who received ratings from the supervisor between 2 and 2.5
received positions. At the other end of the distribution, of those
who received ratings between 3.5 and 4, 78 per cent received positions.

It is possible, as a result of this analysis, to predict with practical 
certainty by the end of the first quarter that certain students will 
not be able to graduate and consequently will not be able to receive 
a position as a teacher. It is possible to tell what the chances of 
the other students will be of graduating and what their chances of 
receiving a position will be. There are only about three chances in 
five of the best students (as judged by actual first quarter grades, 
or predicted first quarter grades) graduating from the Bellingham 
State Normal School in the usual period of six quarters. It is not 
known how many of the students will graduate at a later date. It 
is possible that some of the better students attend the Bellingham 
Normal School for only a short period and then transfer to another 
institution of higher learning from which they were or will be grad- 
uated. It is doubtful whether those students who receive low ratings 
will graduate from this institution or from any other institution of
higher learning.

The findings of this paper may be summarized as follows:

1. A coefficient of multiple correlation of 0.76 ± 0.03 was found
between average first-quarter grades and the combined scores on the
Thorndike examination, the history test, and age at high-school
graduation for the 1927 group.

2. The average first-quarter grades of the students who entered in
the fall of 1928 were predicted by means of the regression equation
obtained from the 1927 group. The coefficient of correlation between
these predicted grades and the actual first-quarter grades was
0.71 ± 0.02.

3. One hundred twenty-five of the students who entered in the
fall of 1928 were graduated in the spring of 1930. The coefficient of
correlation between the predicted grades of this select group and
the actual average first-quarter grades was 0.61 ± 0.04.

4. The coefficient of correlation between the rating which the
student receives in his practice teaching and the actual average
first-quarter grades was 0.45 ± 0.05. This coefficient was not significantly
increased by the combined effect of the tests.

5. The coefficient of correlation between the ratings in practice
teaching and predicted first-quarter grades was only 0.36 ± 0.06.

an^ M ^ ^ “ hetween 0.80 

fint-quarmr ^es, predicted first-quarter grades and 

JrTt^ considerable yalue in predicti^ whether 

whethe? or“not he “ “hool untU graduftion and 

wnether or not he will obtam a position through the appointment 

bumau of tb. mstitution ju) Only one in sLt of those wKS 

appointmenTb^ur:a"u. ‘he 

^^7"* ectual first-quarter grades or predicted fi^H^re? 

between 3 and 4 were graduated, (c) Fifty-four perTt of 
U>. enWtng g^up th.t receiyed actud firstH,uarter gradL betwjn 
LT Poations. (i) Eighty-two per cent of the graduate, 

who receiv^ actual first-quarter grades between 3. and 4* receiyed 

KrcT^* T f Kfednetee who receiyed ayerage 

W " * *“<' 3 receiyed positior ' 

“‘"‘*e‘“® *ho receiyed actual firet-quartar CTadea or 
p^icM fint-quarter grades between 0 and 1 was gfaduated 

‘hat receiyed firat-quarter*Lades or 
predicted first-quarter grades between 1 and 2 was graduated.

8. Ratings of the 1930 graduates who received positions were
received from the superintendents and principals. The reliability of
these ratings has not yet been determined.

9. The rating in practice teaching is the best single index of the
rating which the student will receive in the field. The correlation is
very low, i. e., 0.27.

10. All the tests, grades, and information used in this study give
very low coefficients of correlation when used to predict the rating
which the student will receive in the field.

BIBLIOGRAPHY

THE SIGNIFICANCE OF PERSONNEL MEASURES
AT THE UNIVERSITY OF OREGON

Howard R. Taylor and Clifford L. Constance *

When a competent engineer measures the potential electric energy
of a stream his results appear to be final and absolute in so far as we
are perfectly familiar with, or trustful of, the units used and the
mathematical accuracy of their transmutation from second-feet of
stream flow to kilowatts or horsepower. In reality, of course, the
measurements are purely relative. If you turn to the dictionary for
a definition of second-feet or watts or horsepower, you will be rewarded
with a translation into other terms, and your search will end only
when you encounter terms which you understand well enough to
accept without question or, more often, when you have traced out a
circular series of terms which ends at the word with which you started.
Again, to the naive individual, nothing could be more nonsensical than
the engineer's statement of numerical equivalence between the boiling
white water on a rapid and the predicted magnetic deflection of a
needle on a gage in a power house as yet unbuilt.

Now, the attempt to measure potential college scholarship in terms
of personnel data is essentially a comparison of the same sort but
considerably more difficult, for two reasons. First, the units used in
both sets of measurements are not sufficiently familiar or definite to
have a widely accepted meaning. Experts in mental testing are only
partially agreed as to the meaning of test scores, and college
scholarship in terms of marks or grades is of rather doubtful significance even
among those most familiar with the symbols in which such estimates
are expressed. Second, the equivalence of each set of units in terms of
the other has been very imperfectly established as yet. If we could
accept the psychologist's mental tests as measures of intellect, then
we could determine the meaning of college marks in terms of such
intelligence. Or if we accept college marks as good relative measures
of the extent to which students have profited from instruction, as
indeed many studies of the rough but essential validity of college
marks strongly suggest, we can then appraise the tests and other
personnel measures in terms of such scholarship. But any gain in our
knowledge of either characteristic must be purchased by making one
or the other of these assumptions, and then by evaluating the
relationship which exists between the two sets of measures. In fact, such
procedure is implicit in measurement of any kind. But in the case of
the power-plant engineer, previous laboratory experimentation has
already provided a knowledge of the essential relationships between
his variables, whereas the psychologist attempting to use personnel
measures must continuously work out these relationships for himself
while at the same time he collects his data.

Obviously the determination of the relation between personnel
measures and the excellence of scholastic achievement would have
little merit if the test scores and other measures were themselves
utilized as a basis for giving the marks. For this reason, in addition
to others equally important, the personnel research bureau has
steadfastly restricted the use of personnel data at the University of Oregon,
especially for the two groups reported here, to conferences sought
voluntarily by individual students and to the administrative analysis
of the probable causes of scholastic difficulty after marks in various
courses were already assigned. In general, ratings were not released
to instructors and were used only under conditions which practically
guaranteed the independence of estimates of scholarship from a
knowledge of the personnel ratings during the period of this
experiment.

In September, 1925, we began collecting personnel information for
all students entering as freshmen. Of 454 men and 409 women
entering at that time, 62 men (13.7 per cent) and 110 women (26.9
per cent) had completed their work for graduation at the end of the
summer session in August, 1929, four calendar years after entrance.
In September, 1926, of 455 men and 367 women entering as freshmen,
58 men (12.7 per cent) and 101 women (27.5 per cent) had graduated
by August, 1930, four years later. Thus, just less than 20 per cent of
those entering the University of Oregon as freshmen may be expected
to graduate within the 4-year period following their entrance. It is
the records of these 331 students who graduated within four years
after entrance which constitute the basic data for this study. In
addition we found that 46 men (10.1 per cent) and 27 women (6.6 per
cent), or 8.5 per cent in all, of the class entering in 1925 graduated
sometime during the fifth year after entrance. It is probably safe to
say, therefore, that not more than 30 to 35 per cent of those who enter
the University of Oregon as freshmen will graduate from it.
Apparently this is about what happens at other reputable State universities.*
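
The graduation percentages in the paragraph above can be checked with a few lines of arithmetic. In this sketch the number of men graduating from the 1926 class is taken as 58, the figure consistent with both the stated 12.7 per cent and the stated total of 331 graduates; the variable and function names are hypothetical.

```python
# (number entering, number graduating within four years), from the text.
cohorts = {
    ("1925", "men"): (454, 62),
    ("1925", "women"): (409, 110),
    ("1926", "men"): (455, 58),    # 58 assumed: matches 12.7 per cent and 331
    ("1926", "women"): (367, 101),
}


def per_cent(part, whole):
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * part / whole, 1)


entrants = sum(entered for entered, _ in cohorts.values())
four_year_graduates = sum(graduated for _, graduated in cohorts.values())

# Class entering in 1925: graduates in the fifth year after entrance.
entrants_1925 = 454 + 409
fifth_year_1925 = 46 + 27
within_five_years_1925 = 62 + 110 + fifth_year_1925

four_year_rate = per_cent(four_year_graduates, entrants)
fifth_year_rate = per_cent(fifth_year_1925, entrants_1925)
five_year_rate = per_cent(within_five_years_1925, entrants_1925)
```

The four-year rate for both classes together comes to just under 20 per cent, and the 1925 class reaches 28.4 per cent by the end of the fifth year, which is the basis of the estimate of 30 to 35 per cent.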

* Tbt UnlTcnity of MinDfSoU reports hi iU UD8 roluins on Problems of ooUegs sducstloD/' p, 
ihst 24 per otnt of the entering In 1030 bed ellbtr gredneted or trensferred to e profenioxiel school 

wkMre they bed completed their fourth yser of work et the sod of the normel 4-ymt period. Since our 
flcuree do not Include thoee members of tbe IW sotvlng ciem who bed trsnilKTsd to lew, medicine, or 
erohltecture end completed four yien without teking e dec^t ^ gredoete four 

jeers efUrentreoce et Oregon ere very neerly equlTiieot to the 34 per cent reported el Minnesofe, 
Agein, Toope end Edgerton, of Ohio 6Uto» in their 1929 report of the ** Acedemic pcogiw of students ** ssy» 
p. Ud, thet of ell tbe etudeots who enter tbe unlveretty only ebout 34 per cent wHl probebly gredueU, 
Thii cbeeki very doeely with our estimete of 30 to 36 per cent w^ will grednete, beeed on tbe 
fact that 28.4 per cent had graduated 5 years after entrance.

Now, if one thinks of the attainment of a degree as the sine qua non of
university training, this elimination of fully two-thirds of those who
enter as aspiring freshmen will seem to constitute a major educational
tragedy, mitigated to some extent by the rather small percentage who
complete their college work at some other institution, presumably
better adapted to their needs, and also mitigated by the fact that some
of these students have gone on into professional schools, such as
law, medicine, or architecture, without taking a general university
degree.

On the other hand, if a major function of the university is to furnish
selective guidance and assist students in the inevitable trial and error
procedures of finding their real aptitudes and interests, extensive
elimination may merely evidence the intellectual standards of the
university and its efficiency in shifting students out of unpromising

"f-TT' ^ to think of coUege experience 

which does not eventuate in a degree as time, energy, and money
thrown away. It may well be that the benefit from college training
per se is in many cases as great for those who do not graduate as for
a good many of those who do, although in general the prestige and
personal satisfaction attached to the degree make it well worth
striving for.

But whatever the essential nature of the educational processes
leading to graduation, whether they be primarily instructional or
selective, we have no scientific basis for advising and guiding students,
nor even for reorganizing and improving our scholastic procedures
until we know the meaning of our personnel measures in terms of the
fundamental objectives of college education which ostensibly, at
least, are scholarly achievement in various fields of knowledge. What
can the results of an hour or two spent in puzzling out complex mental
tasks of an unfamiliar and apparently quite impractical sort possibly
foretell about scholastic achievement in say 50 or 60 courses under 20
or 30 different instructors in three or more broad fields of knowledge
requiring normally four years of fairly consistent study? To the
person unfamiliar with the great variability of college students in
mental capacity and unaware of the surprising stability and unitary
nature of psychological samplings of general ability, the expectation
of any connection at all between such variables seems nonsensical,
but that is exactly the reason for resorting to experiment.

The psychological examination of the American Council on Education,
which we have given each year beginning in 1925 to all entering
freshmen, is a general ability or general college aptitude battery of



a. the unlvertity editor, in • follow-up ot 
19W -80, 20 per oeol of tboee who replied 
Mwbere, whjle 30 per oeot pfato to reeoUr 
lUriMd replied. It ii hard to cay Jtut bov 
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the type commonly called an intelligence test. It samples the ability 
to do complex mental tasks where previous training as a factor in
success has been intentionally minimized. In 1925 it consisted of
eight tests for which an hour and a half of working time was allowed.
In 1926 only seven tests were used, and it has since been reduced to
five tests with an hour of working time without reducing its
effectiveness in predicting scholarship. (See following table.)

Evidently the five tests which have constituted the backbone of the
battery each year from 1925 to the present give an adequate sample
of such general abilities as are scholastically important. The
reliability of this battery by the split-halves method for the series of 1925
and that of 1926 we found was r1I = 0.95, and this value has been
extensively corroborated elsewhere (8).* But we are concerned here
with the stability of such a measure over a long period of time as well
as with the extent to which the same abilities are sampled by two
similar forms of the test. In April, 1927, we retested with the
American Council series of 1926, 93 students who had taken the American
Council series of 1925 in September, a year and a half earlier. Thus
we sampled the relative stability of such scores after five quarters of
university instruction. The correlation of total weighted score on
the two independent but similar test batteries was r1I = 0.90. While
the number of cases (N = 93) is small, in comparison with our other
data, I feel quite confident that the coefficient r1I = 0.90 does not
overstate the extent to which American Council test scores represent a
fairly unitary and stable aspect of the potential capacity of college
students. In order to facilitate explanation we transmute all our
personnel data into percentile ranks (P. R.), indicating the percentage
of students entering the University of Oregon as freshmen who made
scores lower than the particular score under consideration. The
percentile ranks of American Council test scores for our Oregon
freshmen, particularly in the upper ranges, approximate very closely the
national norms for 1925 based on 16,000 students in 55 different
colleges (1). While we find empirically that correlations determined
from measures recorded in percentile ranks are slightly lower than
where the original measures are used, as is to be expected theoretically,
the shrinkage in relationship is negligible and is more than offset
by convenience. Thus it is general ability measured in such percentile
rank units as we find most convenient to use which we wish to
compare with scholarship.


* Numbers in parentheses refer to "Bibliography," p. 49.
[Table: correlations (r) between American Council test scores and
scholarship, by test series, 1925 to 1930, for men, women, and total,
with the number of cases (N) and the variables correlated. Only the
column for the series of 1930 is legible in this copy: r total, 0.511;
men, 0.454; women, 0.570; N total, 800; men, 433; women, 367;
variables, gross P. R. v. G. P. R.(1)]

(1) P. R. means percentile rank as explained later, and G. P. R. means
grade-point ratio as explained later.


But the question of the numerical representation of scholastic
excellence is an indispensable preliminary. Instructors evaluate the
work of students at the University of Oregon under six categories.
If grade III (slightly above average) and grade IV (slightly below
average) be combined into an average group, the scale corresponds to
the 5-step scales more commonly used elsewhere. If the grade I
were assigned to 5 per cent, II to 20 per cent, III and IV to 50 per cent,
V to 20 per cent, and VI or F to 5 per cent, the distribution of grades
would approximate the normal curve which has empirically been
found to represent very well the actual distribution of grades in
several universities where the performance of large groups of students
over fairly long periods of time has been studied (2, 3, 6, 7, 10). At
Oregon the actual percentages of each grade assigned during
representative fall, winter, and spring terms combined were: I = 8.8,
II = 23.7, III and IV = 53.5, V = 8.9, and VI or F = 5.0.

In view of the host of diverse and fairly independent factors entering
into quality of achievement, the assumption that it is normally
distributed is reasonable on a theoretical as well as on an empirical
basis.


If now a unit normal distribution be divided into areas proportional
to the percentages of students assigned each grade, we can compute
the distance above and below the mean, where the mid-point of each
such area intersects the base line. The relative distances of each such
point, in units of variability from the mean, may be considered
numerically comparable measures of such differences in the quality of
work as instructors can recognize and value. In standard deviation
units, these distances corresponding to the various percentages of
grades actually assigned at Oregon are I = +1.81, II = +.86,
III = +.04, IV = -.69, V = -1.27, and F = -2.40. Since negative weights
are awkward to use in actual practice, we can preserve the same
mathematical relationships by adding 2.40 to each weight, giving
I = 4.21, II = 3.26, III = 2.44, IV = 1.71, V = 1.13, and VI or F = 0.
This indicates that the traditional numerical weighting of grades long
used by the registrar's office of I = 5 points, II = 4 points,
III = 3 points, IV = 2 points, V = 1 point, and VI or F = 0, is not
greatly in error. Actually, in terms of the frequency with which they
are assigned, it overweights high grades a little, and underweights the
low grades somewhat.
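
The re-basing of the weights described above is plain arithmetic and can be checked with a short sketch. The deviate values and the traditional point values are those quoted in the text; the variable names are illustrative only.

```python
# Standard-deviation weights quoted in the text for each grade at Oregon.
sigma_weights = {"I": 1.81, "II": 0.86, "III": 0.04,
                 "IV": -0.69, "V": -1.27, "F": -2.40}

# Traditional registrar's weighting quoted in the text.
traditional = {"I": 5, "II": 4, "III": 3, "IV": 2, "V": 1, "F": 0}

# Adding 2.40 makes the failing grade zero and leaves the spacing unchanged.
shifted = {grade: round(z + 2.40, 2) for grade, z in sigma_weights.items()}

# A positive difference means the traditional scale gives that grade
# more weight than the derived scale does.
difference = {g: round(traditional[g] - shifted[g], 2) for g in shifted}

for g in shifted:
    print(g, shifted[g], traditional[g], difference[g])
```

On these figures the traditional scale sits above the derived weights for grades I to IV and below them for grade V.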

A priori it would seem that grade points (number of hours earned
times the weighting for quality) would give the best index of student
achievement since it includes both quantity and quality of work.
But the university sets up a definite number of hours which each
graduate must obtain. Hence it would be necessary to use some such
measure as average number of grade points per term if both factors are
to count. Now in general, the factors contributing to variability
in hours carried are so diverse, e. g., self-support, student activities,
health, etc., and the variations in load carried so supervised with
reference to the needs of the individual, that we have preferred to
measure college scholarship in terms of quality alone. Again quantity
and quality of achievement appear to be psychologically very different
aspects of a student and certainly the negative correlation between
quality and quantity of work implied in the combined index is quite the
reverse of the facts. Finally, there is a widespread and growing
conviction that quality of scholarship is what needs emphasis in
American higher education. Hence total grade points (hours multiplied
by the traditional weighting for quality) have been divided by the
total number of hours to give average grade points earned per hour for
which the student registered. Since VI or F (failure) counts zero
points but still counts as hours, all failing grades lower the quality
index in the proper proportion. This index of the quality of college
achievement which we call the grade-point ratio (G. P. R.) will be used
as the measure of scholarship.
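
The grade-point ratio just defined can be illustrated with a minimal sketch. The point values are the traditional weighting quoted in the text; the function name and the sample term are hypothetical.

```python
# Traditional point values per credit hour, as quoted in the text.
POINTS = {"I": 5, "II": 4, "III": 3, "IV": 2, "V": 1, "F": 0}


def grade_point_ratio(courses):
    """Return total grade points divided by total hours registered.

    `courses` is a list of (hours, grade) pairs. Failed hours earn zero
    points but still count in the denominator.
    """
    total_hours = sum(hours for hours, _ in courses)
    if total_hours == 0:
        raise ValueError("no hours registered")
    total_points = sum(hours * POINTS[grade] for hours, grade in courses)
    return total_points / total_hours


# Hypothetical term: 3 hours of grade I, 4 hours of grade III, 3 hours failed.
example = [(3, "I"), (4, "III"), (3, "F")]
print(grade_point_ratio(example))  # (15 + 12 + 0) / 10 = 2.7
```

Because the failed hours stay in the denominator, the three failed hours pull the ratio down from 27/7 to 27/10.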

Our studies of the reliability of such grade-point ratios at Oregon
show that the correlation of three alternate quarters of college work
measured in these terms with three comparable quarters of similar
work is r = 0.89 (N = 396). Thus we can estimate the reliability
of our 4-year grade-point ratios as not less than r = 0.97.*
Evidently the general factors which underlie pooled estimates of the
quality of college work are remarkably consistent and stable. Thus
the reliability of averaged grades is much higher than one might expect
from the well-known inaccuracies of grades in single courses.
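
The step from r = 0.89 to an estimate of 0.97 agrees with the Spearman-Brown prophecy formula named in the footnote. A minimal sketch, assuming the lengthening factor is 4 (twelve quarters of work as against three); the function name is illustrative.

```python
def spearman_brown(r, k):
    """Predicted reliability when a measure is lengthened k times."""
    return k * r / (1 + (k - 1) * r)


# Three quarters correlated with three comparable quarters: r = 0.89.
# Four years of work (twelve quarters) is four times that length.
four_year = spearman_brown(0.89, 4)
print(round(four_year, 2))  # 0.97
```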

Probably the simplest method of representing the relationship
between two variables is to plot a straight line showing the average
amount of change in one variable which corresponds to an average
amount of change in the other. With test data this is usually done
mathematically by computing the correlation coefficient between the
two variables. Of course, relationships may exist which can not be
represented by a straight line. Hence a low obtained correlation
does not mean that no relationship exists but merely that no straight
line representation of whatever relationship exists is possible. As
the numerical value of r approaches ±1, it shows the extent to which a
linear estimate of the one variable is possible from a knowledge of
the other. Hence the size of the obtained correlation coefficient
between two variables, in relation to its probable error, is
indisputable evidence of overlapping factors or identical elements in
the two sets of measures.

The relationships of test percentile rank to 4-year college
scholarship at Oregon are as follows:

* The Spearman-Brown prophecy formula is widely used in this connection
and there is considerable empirical evidence of its validity for such
data.
[Table 1: correlations (r) between test percentile rank and 4-year
grade-point ratio, with number of cases (N), for men, women, and
total, by entering class and for both classes combined. Legible in
this copy for the combined classes: men, r = 0.487 (N = 120); women,
r = 0.576 (N = 211); total, r = 0.526 (N = 331).]


Two things in the table are especially worth noting: (1) Contrary
to popular belief the behavior of women is somewhat more predictable
than that of men. Their scholarship can be estimated in terms of test
score more accurately than can that of the men, r = 0.58 as compared
with r = 0.49 for men. (2) According to the most conservative
interpretation (9) there is not less than 28 per cent commonality
between general ability measured by an hour and a half spent in working
at complex mental tasks and the excellence of college scholarship. Of
course our tests do not measure preparation, studious habits, nor
scholarly zeal except as these may be correlated with general ability.
Such characteristics with many others probably constitute the other
factors (not more than 72 per cent) in scholarship which do not vary
concomitantly with test scores.
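
The 28 per cent figure agrees with the square of the total correlation expressed as a percentage. A minimal sketch, assuming that this squared coefficient is the "most conservative interpretation" intended; r = 0.53 is the rounded total correlation reported for the 331 graduates, and the function name is illustrative.

```python
def commonality_percent(r):
    """Square of a correlation coefficient, expressed as a percentage."""
    return 100 * r ** 2


# r = 0.53 between test percentile rank and 4-year grade-point ratio.
common = commonality_percent(0.53)
print(round(common))        # 28
print(round(100 - common))  # 72, the share left to other factors
```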

Even with the advantage of great accuracy in his separate
measurements, an engineer does not attempt to gage the potential
energy of a stream by a single measurement. He needs a more
representative sample covering the yearly and year to year ups and
downs of the river. Likewise we need a measure of student preparation
and earnestness spread over a period of years. The prevailing practice
in regard to
entrance requirements assumes that performance in preparatory
schools is prognostic of college performance. Yet due primarily,
I think, to variations in the standards of grading in different high
schools, average preparatory school grades have usually given very
slight evidence only of college success. However, in our personnel
research at Oregon we have always considered the evaluation and
use of information already on file, such as preparatory school records,
quite as important as the collection of additional information such
as psychological test scores. Accordingly, we began in 1925 to
compute for all entering freshmen an index of the quality of
preparatory school work. No doubt a good objective test of high-school
knowledge and skill, such as the Sones-Harry or Iowa High School
content examinations, would be even more useful, but over a period of
six years our index has carried equal weight with general ability test
scores in a regression equation for the prediction of college scholarship.

Without describing the several variations in our procedure leading
to better prediction and easier computation, the essentials of our
method are as follows: We empirically equate the various high-school
grading systems and then weight heavily each unit of credit rated in
the highest step interval. This of course makes the index depend
chiefly on such work as was recognized by the high school as
outstanding. This index is also transmuted into percentile ranks for
convenience in explanation. Thus it means that such and such a
percentage of our entering freshmen had preparatory school records
below the one under consideration.

We have no satisfactory method of determining the reliability of
this measure as yet. However, the principal of each school was
required on the entrance blank to place each student in the first,
second, third, or fourth quartile of the graduating class. If we
assume an approximately normal distribution of ability in graduating
classes with roughly equivalent means, these categories can be
transmuted into numerical measures by assigning the appropriate
standard deviation values to the percentage of freshmen falling in each
quartile. For these 1925 freshmen, the correlation of the principals'
quartile rating with our empirical index of the quality of preparatory
school work was r = 0.62 for 248 women, and r = 0.68 for 174 men, or
r = 0.66 for 422 cases. For the 1927 freshmen similar computations gave
r = 0.77 for 320 women, r = 0.71 for 372 men, and r = 0.73 for 702 cases
in all. None of these coefficients can be considered a satisfactory
reliability coefficient, but such a coefficient would hardly be higher
than r_{1I} = 0.7, which would therefore be a conservative estimate.
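
The transmutation of quartile ratings into numerical measures can be sketched as follows. This is only an illustration under stated assumptions: it takes the "appropriate standard deviation value" of a category to be the mean deviate of the corresponding slice of a unit normal curve, and it uses four equal quarters, whereas the text assigns values according to the percentage of freshmen actually falling in each quartile. The function name is illustrative.

```python
from statistics import NormalDist

_UNIT_NORMAL = NormalDist()  # mean 0, standard deviation 1


def segment_mean(p_low, p_high):
    """Mean deviate of the unit normal between two cumulative proportions."""
    # The ordinate vanishes in the extreme tails, so treat 0 and 1 specially.
    ord_low = _UNIT_NORMAL.pdf(_UNIT_NORMAL.inv_cdf(p_low)) if p_low > 0 else 0.0
    ord_high = _UNIT_NORMAL.pdf(_UNIT_NORMAL.inv_cdf(p_high)) if p_high < 1 else 0.0
    return (ord_low - ord_high) / (p_high - p_low)


# Equal quarters of the class, counted from the bottom upward.
quartile_values = {
    "fourth": segment_mean(0.00, 0.25),
    "third": segment_mean(0.25, 0.50),
    "second": segment_mean(0.50, 0.75),
    "first": segment_mean(0.75, 1.00),
}
for name, value in quartile_values.items():
    print(name, round(value, 2))
```

With unequal percentages in the four categories, the same function applies once the cumulative proportions are substituted for 0.25, 0.50, and 0.75.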

Fortunately for our purpose preparatory school records are not
closely related to general ability test scores. For the 331 graduates
of this study the correlation is only r = 0.44 (r = 0.48 for 120 men,
and r = 0.41 for 211 women). Thus using the conservative
interpretation previously referred to, there may be as little as 19 per
cent commonality between these measures of preparatory school work
and the general ability test scores. Among many possible explanations,
our experience suggests that this is because effort, i. e., docility,
dependability, cooperative attitudes, and persistence, are more
important factors in preparatory school success than is all-around
intellectual ability. Especially does our experience with individual
personnel records support such an interpretation. When a student
enters with a low test percentile rank (P. R.), but high high-school
percentile rank, we usually find a hard-working, serious-minded
individual whose difficulties are chiefly slow learning, or inability to
appreciate abstract generalizations or to display critical insight.
On the other hand, freshmen with high test percentile ranks and low
high-school percentile ranks are bad college risks because they have
formed habits of loafing and just getting by in place of efficient study
habits. Their difficulties are chiefly inadequate preparation and the
surprising stability of habits of superficial thinking and of dawdling
over scholastic tasks.

The relationships of high-school record to 4-year college
scholarship are as follows:

Table 2. Four-year study showing correlations between high-school
percentiles and 4-year grade-point ratios, all departments included

In Table 3 the results of combining test percentile ranks with
high-school percentile ranks in the prediction of 4-year scholarship
are shown:

Table 3. Four-year study showing correlations between average of
high-school and test ranks and 4-year grade-point ratio

From Table 2 we may conclude that in spite of the inaccuracies
involved in equating high-school marks, an index can be derived from
them which is approximately equal in predictive significance over
the whole 4-year college period to that of American Council General
Ability Test scores. Turning this finding about we may say certainly
that the logic of basing admission to college on preparatory school
records applies with equal force to psychological test scores.

In two respects at least the latter are superior. (a) They are
highly objective and hence impartial. Since special training is
minimized as a factor in making high test scores, all students are more
nearly on the same basis than is the case with preparatory school
records with their inevitable variation in standards of grading. (b)
The psychological test furnishes an adequate sample of general
ability in an hour and a half or less, while it takes three to four years
of observation in preparatory school at least as at present recorded
to give equal predictions of college scholarship.

From Table 3 the supplementary value of preparatory school records
is made clear. When high-school percentile rank is averaged with
test percentile rank, the correlation with 4-year college scholarship
rises to r = 0.62 within the rather restricted range of ability which
achieves graduation in that time. Thus in this combined measure
we have at least 38 per cent commonality between personnel ratings
and college scholarship, a gain of at least 10 per cent in unique
factors over either personnel measure alone.

We have also made an extensive analysis of the scholastic
significance by departments and schools for each of the five tests in the
American Council battery. Our data hardly justify the use of
Spearman's mathematical techniques for determining to what extent the
relationships found between separate tests, preparatory-school records,
and 4-year college scholarship may be thought of as due to the presence
of a single general factor plus many specific factors. But there is a
strong suggestion that the burden of prediction for each of our
measures is carried by some unitary set of factors which is sampled
again and again rather than by the teaming up of independent groups of
factors constituting unique traits. All our departmental correlations
are positive and appear to be in hierarchical order ranging down
toward zero for certain tests in certain departments, but never being
significantly negative with scholarship in any department. Again
the intercorrelations of each separate measure with any of the others
and with the criterion of 4-year scholarship (grade-point ratio) are
always positive as appears below.

Table 4. Four-year study showing intercorrelations of 8 dependent
variables with the criterion 4-year scholarship (1925-1929 and
1926-1930 groups combined)

Men: N = 120; 62 on analogies
Women: N = 211; 110 on analogies
Total: N = 331; 172 on analogies

[Body of table not legible in this copy.]

But when these measures are teamed up by multiple correlation for
the prediction of 4-year scholarship, the unique contribution of each
separate measure diminishes considerably as the overlapping general
factors included in each measure are appropriated by those most
heavily saturated with them at the start. This is well shown by the
partial correlation coefficients of the fifth order for each measure
with the criterion:* r_{12.45678} = 0.49; r_{14.25678} = 0.35;
r_{15.24678} = 0.28; r_{16.24578} = 0.18; r_{17.24568} = 0.00;
r_{18.24567} = 0.15.

Thus we may say that each test in the American Council battery
contributes effectively to our knowledge of certain general factors
underlying 4-year scholarship. High-school record samples these
same general factors to some extent and adds unique factors which
make such records equal in predictive importance to the 5-test scores
combined. But the refined statistical procedure of determining the
weight of each separate variable in a regression equation for predicting
scholarship makes a negligible improvement in prediction over the
simple procedure of averaging test percentile rank with high-school
percentile rank: R_{1.245678} = 0.65 and r = 0.62.

* Test percentile (Variable X3 of Table 4) has been omitted from the
multiple correlation because it is merely the total of the five
subtests (X4, X5, X6, X7, and X8). Likewise first-term grade-point
ratio (X9) has been omitted because it is a part of the criterion (X1).

A laborious search for additional measures of predictive significance
has been rather disappointing.

1. The Inglis test of English vocabulary at the college level gives
fair correlations with scholarship but adds almost nothing to
prediction, because it is correlated r = 0.4 with high-school record and
r = 0.7 with American Council test score (N = 800). Thus it furnishes
little unique information about students and raises the multiple 
correlation of our personnel records with scholarship in the third 
decimal place only. 

2. Division of test performance into linguistic and quantitative 
abilities in spite of the light it seems to throw on the intellectual 
make-up of individual students does not add appreciably to prediction. 

3. We have tried out a homemade test of ability to take notes, with 
an added feature designed to indicate persistence. Neither measure 
gives any important new information about the potential scholarship 
of entering freshmen. 

4. We have found the difference between test percentile rank and
high-school percentile rank very revealing in individual cases, but
these differences apparently do not represent any single trait in linear
fashion, hence fail to improve general prediction.

5. Health records and data from the physical examination made at
entrance give little or no indication of relative scholastic achievement, 
although they are certainly very useful in advising with individual 
students. 

6. Measures of interest of the extrovert-introvert type and
estimates of time spent in study show considerable promise as
indications of scholarship, but we have been unwilling to use them
systematically, because as soon as students realize that mere
statements of their attitudes and habits are to be used
administratively, the frankness of such statements becomes doubtful.

7. After discovering two boys in college whose elementary school
preparation as measured by Stanford Achievement barely equalled
that of the average seventh and eighth grader, respectively, we share
the belief of Dr. Luella Pressey (5) that a representative sample of
elementary-school tool skills might afford considerable predictive as
well as diagnostic information about college scholarship. However,
we have as yet developed no test for such a purpose.

Instead, we have extended our sampling of student performance
into the college situation. Does scholastic performance during the
first quarter in college mean anything with reference to 4-year
scholarship? Advocates of freshman week and sentimentalists in general
picture the terrific adjustment problems of students away from home,
lost in crowded classrooms, lonesome and bewildered by the details
of new social and intellectual requirements as if performance under
such conditions were quite unrepresentative of real potentiality.
But if 4-year college scholarship be considered an adequate criterion
of scholastic potentiality, first quarter grades are on the contrary
highly indicative (r = 0.78 for these 331 students). Of course this
correlation is to some extent spurious in that first quarter grades are
averaged into the total with which they are correlated, but this is not
spurious for our purpose since it is merely the earliest possible
prediction of total scholarship which we are seeking. But the high
correlation of first quarter grades with the average of test score and
high-school record (r = 0.57) prevents our getting as much unique
information from these first quarter grade-point ratios as might be
anticipated. Using these three measures, test percentile rank,
high-school percentile rank, and first quarter grade-point ratio, the
multiple correlation with 4-year college scholarship is R = 0.74 for 120
men, R = 0.83 for 211 women, and R = 0.81 for the 331
students who graduated four years after entrance. Thus these three
personnel measures have at least 64 per cent commonality with the
quality of 4-year college scholarship. Still more important, the
reliability with which the common factors pervading all three
personnel measures, and also underlying all-around scholarship, are
sampled is sure to be much higher for the combination than for any of
the measures singly. Hence injustice to individual students is greatly
minimized by using such a combination.

This demonstration of the equivalence in meaning of college
scholarship and personnel measures is not complete for two reasons. In
the first place, it seldom pays to run every speck of potential energy
through the power house. It is usually better to allow many streams
whose "head" of potential energy is slight to flow off into other
channels where they are really much more serviceable. Now by
comparing the variance (σ²) in all these personnel measures for the
whole entering group with the variance in these measures for those who
graduated, it is possible to estimate what the correlation of the
measures with 4-year scholarship would have been if the total range
of entering talent had continued or been allowed to continue under the
selective processes represented by scholastic grades (4). Thus the
estimated correlation of test scores alone, with 4-year scholarship if
all students entering had remained that long, would be r = 0.62 instead
of r = 0.53 within the restricted range of those who graduate.
Similarly, the estimated correlation of our combined personnel measures
with 4-year scholarship if all who entered remained, would be r = 0.87
instead of r = 0.81 within the restricted range of those graduating.
In the second place, even for engineers estimated input and obtained
output never balance exactly because there is always some error in
measurement. So in estimating scholarship from personnel measures,
both the criterion and the measurements upon which the estimates
are based are by no means perfectly accurate. But with a reliability
coefficient of r_{1I} = 0.97 for the pooled grades of the criterion and
probably about r_{1I} = 0.95 for the combined personnel measures, the
estimated true correlation of potential and actual achievement would
not rise greatly above the est. r = 0.87 found by allowing for
restricted range.
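
Both allowances discussed above can be sketched with formulas in common use. This is an illustration under stated assumptions, not the computation the author necessarily performed: the first function is the usual correction for restriction of range on the predictor, the second is the usual correction for attenuation, and the ratio of standard deviations (1.26) is a hypothetical figure chosen for illustration, since the text does not report the variances.

```python
import math


def correct_for_range_restriction(r, sd_ratio):
    """Estimate the correlation in the unrestricted group.

    r        : correlation observed in the restricted (graduating) group
    sd_ratio : standard deviation of the measure in the whole entering
               group divided by that among graduates (hypothetical here)
    """
    return r * sd_ratio / math.sqrt(1 - r ** 2 + (r * sd_ratio) ** 2)


def correct_for_attenuation(r, reliability_x, reliability_y):
    """Estimate the correlation between error-free measures."""
    return r / math.sqrt(reliability_x * reliability_y)


# With an assumed SD ratio of 1.26, r = 0.53 rises to about 0.62.
print(round(correct_for_range_restriction(0.53, 1.26), 2))

# Reliabilities of 0.95 and 0.97 applied to the estimated r = 0.87.
print(round(correct_for_attenuation(0.87, 0.95, 0.97), 2))
```

The second figure comes out only a few hundredths above 0.87, in keeping with the statement that the true correlation would not rise greatly.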

Yet, surely this makes it evident that early in a student's college
career we can gage with surprising accuracy the excellence of his
future scholastic achievement. Unmeasured factors, even of interest
and effort, can not be highly important. Apparently such interest
and effort factors as really function are included rather completely
in preparatory school records and first quarter grades. Thus we have
measures of potential scholarship which indicate very well what the
actual achievement in knowledge, technical skill, and productive
scholarship will be under the transforming influence of university
instruction. To be sure this in no way disparages the necessity of
continuous painstaking college instruction as superficial thinkers have
sometimes fallaciously argued. One might as well argue the futility
of the power house in transforming potential energy into kilowatts.
It is only because of exacting instruction that potential scholarship
forecasts so definitely future scholastic achievements. But whatever
the nature of the general factors which underlie all-around college
scholarship, we can be sure our test scores and other personnel
measures indicate potentially these same general factors. If the pooled
judgments of instructors with reference to scholarship have value,
that same value applies to test scores and other personnel measures.
Or if test scores be accepted as measures of intellect, then college
instructors as a group recognize and value such ability very definitely.

The practical significance of these findings can be illustrated by an
experiment in correlating estimated scholarship with that actually
obtained at the university. If the average of test percentile rank
and high-school percentile rank be combined with first quarter grades
in the university, the composite correlates r = 0.825 with 4-year
grade-point ratio for the 172 students who graduated four years after
entrance in 1925. Using the same weightings for these three
personnel measures, we estimated the 4-year grade-point ratio of the 159
students in the 1926 entering class who graduated four years after
entrance. We then correlated the estimated grades with the
grade-point ratio actually obtained by these students during their
college careers. This correlation was r = 0.806 which at the same time
verifies our computations of the previous year and demonstrates the
essentially stable significance of our personnel measures from year to
year. Finally, we have used the same measures and weightings to estimate
the average scholarship of practically all those entering the university
as freshmen in 1925 who stayed as long as one quarter.* These
estimated grade-point ratios were then correlated with the average
points per hour actually earned for as long as the student remained
or was allowed to remain. The result was r = 0.86. It will be
remembered that the estimated correlation of these measures with 4-year
scholarship in such an unrestricted range was r = 0.87. Again the
consistency of our data and computations is gratifying.

And now, briefly, what should be done about it? Well, so far as
average quality of college scholarship is an adequate criterion,
preparatory school records and psychological test scores are valid
bases for advising very mediocre students not to attempt college
work. Such an evaluation of college potentiality for each
preparatory school student is feasible during his final high-school
year. All that is needed is a uniform State-wide testing program and the
preparatory school record. But if educational democracy requires
an open door in State institutions for all high-school graduates,
then certainly fairly rigorous elimination may begin at the end of
the first quarter without any serious injustice and with great savings
in time, money, and effort, not to say agony for all concerned. The
following chart presents the evidence for this statement in graphic
form.

Suppose in 1925 we had placed on probation at the end of the first
quarter all students for whom our combined personnel measures
predicted a 4-year grade-point ratio of 2.5 or less, and then
disqualified all who failed to maintain a grade-point ratio of 2. Twenty
per cent of the entering class would have gone on probation, but
less than 3 per cent of our 1929 graduates, and only 11 per cent
of those who graduated in five years would have done so.
Presumably not a single graduate would have been disqualified. So
far, students with grade-point ratios less than 2 simply do not
graduate. Everything seems to indicate that the line between
satisfactory and unsatisfactory scholarship at the University of
Oregon might well be drawn at this point. Of course, students
failing to earn grade-point ratios of 2 in any quarter should also
go on probation and be disqualified if they do not improve, no matter
how promising the scholarship indicated by their personnel records.
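
The proposed rule can be written out as a small sketch. The cut-offs of 2.5 and 2 are those stated in the text; the function names, the argument names, and the reading of "do not improve" as a later ratio still below 2 are assumptions made for illustration.

```python
PROBATION_CUTOFF = 2.5   # predicted 4-year grade-point ratio
DISQUALIFY_CUTOFF = 2.0  # grade-point ratio that must be maintained


def on_probation(predicted_gpr, latest_quarter_gpr):
    """Probation if the prediction is 2.5 or less, or a quarter falls below 2."""
    return (predicted_gpr <= PROBATION_CUTOFF
            or latest_quarter_gpr < DISQUALIFY_CUTOFF)


def disqualified(was_on_probation, maintained_gpr):
    """Disqualify a probationer who fails to maintain a ratio of 2."""
    return was_on_probation and maintained_gpr < DISQUALIFY_CUTOFF


# Hypothetical student: predicted 2.4, first quarter 2.6, later ratio 1.8.
status = on_probation(2.4, 2.6)
print(status, disqualified(status, 1.8))
```

A committee reviewing individual cases, as the text goes on to suggest, would sit on top of such a rule rather than be replaced by it.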

A scholarship committee considering individual cases on their merits
could easily furnish all the elasticity in administration which would
be needed. The chief advantages of such a system would be:

1. When probation and disqualification are based on single quarter
grades alone, as at present, the reliability of such judgments is quite
low because, even when the grades of a single quarter are averaged,
the reliability of the resulting index is, in general, r_{1I} = 0.73
and much lower for possible combinations of departmental grades,
to say nothing of the dubious reliability of a single-course grade,
which often decides the whole matter. But when probation is based
on potentiality estimated from personnel measures, including
first-quarter grades, both the reliability and predictive significance
of such judgments in terms of 4-year scholarship are greatly improved.
The same argument applies to basing disqualification on cumulative
grade-point ratio instead of on the scholastic performance of a single
quarter.

* We utilized here data collected by the registrar, Dr. Pallett, for
another purpose. Lack of other information about these students and
withdrawal during the first quarter explains why 763 instead of the whole

2. If probation were in part based on general ability test score,
it would tend to stimulate each entering student to do his best on
the psychological examination, thus insuring more accurate measure-
ment.

3. Likewise, if probationary status depended in part on preparatory 
school record or high-school content test score, there would soon be 
a salutary effect on the attitude of high-school students toward their
preparatory school work. 

4. Finally, scholastic leniency with freshmen is unjustifiable in
view of the high correlation of first quarter grades with 4-year record.
Such leniency encourages excuse making and refusal to face the facts
so far as scholarship is concerned. Under the proposed more rigorous
requirements students with real potentiality would be “put on their
toes” at once instead of being confirmed in habits of preparatory
school loafing. There is no better time to break with inefficient
habits than when changing from accustomed surroundings and
associates to new ones. As William James said in his famous essay,
“a complete break is far better than a gradual one.” For students
without any scholastic future, an early recognition of that fact will
go far to prevent a costly and tragic struggle against probability
into which very mediocre students are lured by lenient scholastic
requirements for freshmen. What we need is less sentiment and more
psychology in freshmen week and personnel procedures generally.
A STUDY OF THE COLLEGE APTITUDE AND ABILITY
OF HIGH-SCHOOL SENIORS

John S. Jordan ¹

I. INTRODUCTION 

The survey reported herein originated in the desire to obtain cer- 
tain specific information about the students coming to the Washington 
State Normal School at Ellensburg as compared with high-school
seniors in general in the State, and as compared with those high- 
school seniors going on to other institutions of learning. 

Some of the principals and superintendents of Washington have 
complained from time to time, perhaps rather uniquely, that our 
graduates are not always phenomenal teachers. We in turn have
complained that the high-school seniors sent to us by these same
superintendents are occasionally lacking in some of the academic
virtues. This survey was suggested at a meeting of principals and 
superintendents of Yakima County, Wash., at which a representa- 
tive of this school was present. The futility of the expression of 
personal opinions upon such matters was evident to many of those 
participating. The survey was, therefore, undertaken with the hope 
of supplying some objective evidence bearing upon the issues in- 
volved. 

The writer wishes to express his appreciation of the fine courtesy 
and cooperation of the superintendents and high-school principals of 
Yakima County. Without this spirit, the study reported below 
would have been impossible. In many instances the giving up of 
half a school day for the testing program must have been inconvenient. 
The clerical labor necessary for the recording of marks and other data
by administrators or their assistants was a burden which was cheer- 
fully assumed by most of those involved. An attitude of this sort 
is a hopeful indication of future progress in educational investigations. 

II. PURPOSES OF THE SURVEY 

1. To supply information or data of value to superintendents and 
principals for use in the administration of their respective high schools. 

2. To discover the range and status of college aptitude in a typical
agricultural county of the State of Washington. 


¹ John S. Jordan, head of the department of psychology, State Normal School, Ellensburg, Wash. B. A.,
University of Denver; M. A., Stanford University. He was formerly a member of the faculty
of the Department of Psychology and Education of Colorado College.
3. To determine the relative status of students entering teacher-
training institutions.

4. To determine the relative aptitude of the men as compared with
the women students. 

5. To determine the relation between high-school marks in several 
departments and aptitude test results. 

6. To determine the relative values of high-school marks in the 
earlier years of high-school work, as compared with marks obtained 
in the last two years and the last year. 

7. To discover the relative college-aptitude status of the seniors
of small, medium, and large high schools.

8. To determine the relation between college aptitude and the 
degree of acceleration or retardation in the elementary school and 
the high school. 

9. To determine the relationship, if any, between college aptitude 
and varying proportions of rural and town elementary schooling. 

10. To discover the relationship between college aptitude, as meas- 
ured by the tests used, and the occupation of the father. 

11. To discover the relationship between the prediction index and
the occupational intentions of the high-school seniors. 

12. To determine the relationship between the prediction index 
and the educational intentions of the seniors, with separate analyses 
for different types of intended higher institutions, and for different 
sorts of training, such as liberal arts, engineering, teacher-training, etc. 

13. To follow up, from the standpoint of the results obtained, the 
seniors who may enter upon their first-year course of preparation
for teaching.

It is realized that many of the above-mentioned objectives are im- 
perfectly accomplished in the study. This is due to many causes, 
among which are, smallness of sampling, inadequate methods of 
measurement, and limited means for securing data of a reliable sort. 

III. PROCEDURE 

1. The battery of tests, which is used at the normal school for all
incoming students, was administered to the high-school seniors at 
the following schools of Yakima County: Cowiche, Granger, Lower 
Naches, Mabton, Moxee, Naches, Outlook, Selah, Sunnyside, Tieton, 
Toppenish, Wapato, Yakima, Zillah. The total number of students 
tested was 458. This group of tests has been chosen for use at the 
normal school, after considerable trial and experimentation, as 
meeting the needs of a teacher-training institution. It is probable 
that some other combination might be better for general college 
purposes, but it was believed that the advantage of comparing the
performances of the high-school seniors with those of incoming 
normal-school students would more than offset any possible disad- 
vantages. The tests used were as follows: 
(a) The Detroit Advanced Intelligence Test is a variegated collec-
tion of performances representing a wide sampling of functions of a
sort which most high-school students have had an opportunity to
learn. The test is not based upon specific units of subject matter,
but is quite general in nature, and is rather typical of the sort of
instrument commonly called a “group intelligence test.” The
Army Alpha was a pioneer test in this field. Such tests should not
be thought of as direct measures of innate capacity, nor as a dis-
crete measure of “pure intelligence.” The object of its use was to
obtain an all-around measure of college aptitude. Evidence from
schools using such tests indicates that they are of some value in pre-
dicting college success as represented by academic marks.

(b) The new Stanford Arithmetic Test is a survey test including 
two subtests, one on the fundamental operations, and one for reason- 
ing problems. The problems are representative of those commonly 
offered in the public schools in the grades through the ninth. All 
students in the Washington State Normal School, preparing for a 
teacher’s diploma, who fall below the ninth grade norm, are required 
to take remedial work without credit until the deficiency is removed. 

(c) The Iowa Comprehension Test contains selections of material 
similar to that found in many college textbooks in the fields of science, 
history, and literature. Fifteen questions are provided for each of 
the three selections. The test is intended to measure the ability to 
read understandingly such material at a fair rate of speed. 

(d) The Purdue English Test covers certain fundamentals of Eng-
lish as needed in everyday life and particularly by teachers. The
topics covered are punctuation, grammar, choice of words, literary
information, spelling, vocabulary, and reading. Normal-school stu-
dents falling below an empirically determined standard are required
to take remedial work in English without credit.

2. The tests were administered, with one exception, by J. S. 
Jordan, of the psychology department, who is in charge of testing at 
the normal school. The tests take almost three hours to administer. 
The entire morning or afternoon session was consumed in each 
instance, with a 5-minute intermission at about the midway point. 
The standardized directions and time limits were adhered to abso- 
lutely. 

3. The tests were scored by advanced normal-school students who
were paid for their work, and who were under close supervision.
Each test was scored independently by two different people. The 
standardized scoring directions were strictly adhered to. 

4. A prediction index (P. I.) was computed for each student.
The prediction index is a composite derived from the scores of the
entire battery of tests. In arriving at this composite, the scores are
weighted in proportion to the amount contributed by each test to
college marks. The weights were based on data from about 600
normal-school students. The contributions are determined by the
multiple correlation method applied to the marks as the criterion and
the test scores as the variables. Weightings are also included for
differences in absolute size of score. The prediction index is a means
of expressing by a single number a prediction, based upon standardized
test results, as to a student's probable success in college work. The
weighting of the prediction index is computed so as to yield an aver-
age of 100. The evidence is that the prediction index is more accu-
rate than any single test score or any unweighted combination. The
multiple correlation obtained between the prediction index and marks
for the fall quarter of 1929 was 0.792 with a probable error of 0.017.
This multiple correlation coefficient is the highest possible correla-
tion between the composite of the test scores, each one weighted
in optimum manner, and the dependent variable, namely, normal-
school marks. The prediction index is the concrete representation
of the best weighting plus the other adjustments referred to above.

5. The data concerning the occupational and educational inten- 
tions of the students, and other personal information were secured in 
most of the schools from a mimeographed blank, filled in by students 
under the direction of the examiner during the testing period. The 
blanks were not ready for use in the first schools tested. For these 
schools the blanks were sent to the superintendent with the request 
that they be completed. The high-school marks for four years were 
secured from the official records of each school. The test data are 
complete for 458 high-school seniors. The data are not complete 
for the other information due to the failure of a few superintendents 
or principals to make completer returns. But the deficiencies are 
small compared to the number for which information is complete, 
and we seem justified in assuming that all of the data represent fair 
samplings. 
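
The construction of the prediction index described in step 4 can be sketched in code. What follows is only a minimal illustration, assuming an ordinary least-squares regression of marks on the four test scores and a simple rescaling so that the indexes of the reference group average 100. The data are synthetic, and the names `fit_weights` and `prediction_index` are hypothetical; neither the exact computation nor the weights used in the study are given in the text.

```python
import numpy as np

def fit_weights(scores, marks):
    """Least-squares weights (with intercept) for predicting marks from test scores."""
    X = np.column_stack([np.ones(len(scores)), scores])
    coef, *_ = np.linalg.lstsq(X, marks, rcond=None)
    return coef  # coef[0] is the intercept, coef[1:] are the test weights

def prediction_index(scores, coef, reference_mean):
    """Weighted composite, rescaled so that the reference group averages 100."""
    X = np.column_stack([np.ones(len(scores)), scores])
    return 100.0 * (X @ coef) / reference_mean

# Synthetic stand-in for the normal-school records (NOT the study's data).
rng = np.random.default_rng(0)
n = 600
scores = rng.normal(loc=[26, 129, 104, 99], scale=[7, 30, 11, 17], size=(n, 4))
marks = 2.0 + scores @ np.array([0.02, 0.004, 0.005, 0.008]) + rng.normal(0, 0.3, n)

coef = fit_weights(scores, marks)
predicted = np.column_stack([np.ones(n), scores]) @ coef
index = prediction_index(scores, coef, predicted.mean())

# The multiple correlation R is the correlation between the marks and the
# optimally weighted composite; no single test can correlate more highly.
R = np.corrcoef(marks, index)[0, 1]
print("mean index:", round(index.mean(), 1), " multiple R:", round(R, 3))
```

Because least-squares weights absorb differences in the units of the tests, no separate adjustment for size of score is needed in this sketch; the study's own adjustments may have been made differently.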

IV. RESULTS 

1. Distributions of test scores and prediction indexes.


Table 1. — Summary of results from 458 Yakima County high-school seniors
combined, and 827 first-year normal-school students, fall 1929
IOWA COMPREHENSION TEST
Q (quintile)—1

2 7 

! 

7.1 

I 27.3 

43-Sl 

30-28 

20 

7.6 

20l2 

42-33 

32-28 

37-21 

24-20 

s 

' 26-23 

4 

' 22-20 


10-00 

10-10 



Standard error of difference

.00 

% 


DETROIT INTELLIGENCE TEST 


Measures 

High- 

school 

seniors 

Normal- 

school 

freshmen 

I * 

2 

2 

3 

Mean 

129 


Standard deviation 

31.7 

24 g 

31 g 

V.. 

25. 4 

Q-i 

227-153 

201-149 

2 

152-135 

1 AA- 1 ^ 1 

3 

134-120 

1^0^ JOI 

1 W-l lA 

4 

llf^l04 

IsJLr^ 1 lO 

114-103 

6 

103-60 

IfWL-IO 

Standard error of difference.. 

1 6 I 




_ _l 



NEW STANFORD ARITHMETIC TEST 


Mean * 

104 

im 

Standard deviation 

11 

10. 0 

iUO 
II 03 

V 

i 1. no 
in 7 

Q-l *» 

125-1 14 

111. I 

" lOi-l li 

2 

113-110 

1 1 H 

1 i3-inR 

3 

109-103 

1 lO 1 VO 

107-109 

4 

102-95 

IU< lUZ 
101-QA 

5 

94-55 

lU 1 VO 

01 - 7n 

Standard error of difference. , 

.90 

lU 





PURDUE ENGLISH TEST 


Mean I 

99 

99 

Standard deviation. 

17. 4 

15. 8 

V : 

17.’ 6 
143-115 

16 

13/V-l 1 A 

Q— 1-.: 

2 " 

114-105 

xmt^ 1 1 n 
1 11-101 

3 

104-97 

irruQA 

4 ; ^ 

96-57 

01-fi7 

5 

80-43 


Standard error of difference. : ... 

1.3 

• Ov no 





PREDICTION INDEX . 


Mean 

102 

101 

Standard deviation 

19.3 

19 

Avl 

20. 3 

V 



156-118 

mU 

119-1 Ifi 

2 

117-106 

1^1 1 iO 

1 17.1ftA 

3 . 

ill l\fQ 

105-97 

i l f — IGO 
Iftl-QA 


96-54 

VO 

Ql-AA 

5 " 

83-63 

in ~oo 
fU-Al 

Standard error of difference 

1.6 

Ol 






It is to be noted that the means for the two groups in the Iowa
Comprehension Test, the new Stanford Arithmetic Test, and Purdue
English Test, and the prediction indexes are practically the same.
The high-school seniors are slightly superior in the Detroit Advanced
Intelligence Test, the difference being more than three times the stand-
ard error of difference. The standard error of difference is a means of
indicating the significance of a difference between two measures.
It expresses the probability of such a difference being erroneous
because of smallness of sampling. If the standard error of difference
is equal to the difference between two measures, there is approxi-
mately one chance out of three that a complete sampling would show
no difference. If the difference between two measures is three times
the standard error of difference, the chances are 369 to 1 that there is
a true difference. A 3 to 1 ratio between a difference and its standard
error is considered by most statisticians to indicate practical certainty
of the existence of such a difference. In illustration, the difference
between the mean score of the high-school seniors in the Detroit
Advanced Intelligence Test and the mean score of the normal school
freshmen in the same test is 5 points. The standard error of differ-
ence is 1.6. The difference between the means, namely 5, is slightly
more than three times the standard error of difference, which is 1.6.
Therefore, the difference may be considered as almost certainly
representing a true difference. On the other hand, the difference
between the mean prediction indexes of the two groups is only 1 point.
The standard error of difference is 1.6. Therefore, we can not say that
a difference has been established. The highest average prediction
index for any high school was 113. The lowest average was 81.
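
The odds quoted above can be checked with a short calculation. The sketch below assumes that the sampling distribution of the difference is normal and that the chances are reckoned from both tails, which the text does not state explicitly; the function names are illustrative only.

```python
import math

def chance_of_no_true_difference(ratio):
    """Two-sided normal tail probability for ratio = difference / standard error."""
    return math.erfc(ratio / math.sqrt(2))

def odds_for_true_difference(ratio):
    """Odds (to 1) that a difference of this relative size reflects a true difference."""
    p = chance_of_no_true_difference(ratio)
    return (1 - p) / p

# A difference equal to its standard error: about one chance in three.
print(round(chance_of_no_true_difference(1), 3))

# A difference three times its standard error: about 369 to 1.
print(round(odds_for_true_difference(3)))

# The two comparisons discussed in the text (difference, standard error).
for label, diff, se in [("Detroit test", 5, 1.6), ("Prediction index", 1, 1.6)]:
    ratio = diff / se
    print(label, "ratio:", round(ratio, 2),
          "chance of no true difference:", round(chance_of_no_true_difference(ratio), 3))
```

Under these assumptions a ratio of 1 gives a chance of roughly 0.32 and a ratio of 3 gives odds of roughly 369 to 1, in agreement with the figures in the text.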

The variability of the normal school freshmen is very similar to
that of the high-school seniors in all of the measures. The term
variability refers to spread or dispersion within a group. High
variability means a wide range of scores on either side of the mean.
Low variability means a concentration of scores close to the mean.
The standard deviation is considered to be the most reliable means of
expressing the amount of variability. But the standard deviation is
in terms of test units, so if a comparison is to be made between the
standard deviations of two tests having different units, as is usually
the case, the comparison is difficult to interpret. The coefficient
of variability, V, expresses the standard deviation as a percentage
of the mean. The mean may be thought of as the most representative
score; therefore V of any distribution may be
compared with V of any other distribution, because all V's are
rendered comparable through the equating influence of the means.
Comparison of the V's reveals some interesting information. Both
groups of students are most spread out in reading ability, as indicated
in the results of the Iowa Comprehension Test, the variability being
almost twice that of a normal probability curve. The spread is also
marked for intelligence test scores. The dispersion is close to that of
a normal distribution for the prediction index and English Test
results. The results from the arithmetic tests show that the variation
in this subject is very small, or in other words, that the students are
more alike in arithmetic than in any other function measured in this
survey.
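
The coefficient of variability discussed above can be computed directly. The sketch below assumes the usual definition, V = 100 × standard deviation ÷ mean, which the text implies but does not write out; the function name is illustrative. The means and standard deviations are those printed for the high-school seniors in Table 1.

```python
def coefficient_of_variability(mean, standard_deviation):
    """Standard deviation expressed as a percentage of the mean."""
    return 100.0 * standard_deviation / mean

# (mean, standard deviation) for the high-school seniors, from Table 1.
seniors = {
    "Detroit Advanced Intelligence": (129, 31.7),
    "Prediction index": (102, 19.3),
}

for test, (mean, sd) in seniors.items():
    print(test, round(coefficient_of_variability(mean, sd), 1))
```

Because V is a pure number, the spread of the intelligence scores (about 24.6) can be compared directly with that of the prediction index (about 18.9), even though the two measures are in different units.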

2. Comparison of boys with girls. 
Table 2. — Comparison of high-school boys with high-school girls


Test 

Boys 

Girls 

Mean

Standard 

devia- 

tion 

V 

Mean 

Standard 

devia- 

tion 

V 

Standard 
error of 
difference 

1 

2 

3 

( 

4 

& 

6 

7 

8 

Iowa Comprehension

Detroit Advanced Intelligence

New Stanford Arithmetic

Purdue English

Prediction Index

20 

\Z2 

106 

07 

102 

7. 82 
20.82 1 
10.00 i 
17.56 ' 
10. 92 

1 

28 
22 
0 
18 
19 1 

20 

123 

101 

101 

102 

7.18 
27.99 
0.04 
17. 10 
19, 02 

27 

22 

6 

17 

18 

0.08 
2.70 
.80 ^ 
1.00 ^ 
1.80 


According to Table 2, the high-school girls are slightly superior in
English. The boys are significantly ahead in the intelligence test
scores and even more superior in arithmetic. There is no sex differ-
ence in the reading test scores, nor in the prediction index. Compared
to the total size of the scores, the differences are slight. Certainly
these data give no basis for the opinion sometimes expressed by
teachers that girls are brighter than boys.

In all of the measures except the intelligence test, the high-school 
boys are slightly more variable than the girls. The only measure
for which this greater spread is marked is the arithmetic test. 

Data secured over several quarters from normal school students
indicate slightly greater variability among boys than among girls. 

Table 3. — Comparison of normal-school boys with normal-school girls



Test




Mean score 

Standard 
error of 




i 

• 1 

Boys 

Girls 

difference


1 



! 

i 

2 

3 

4 

Iowa Comprehension




1 


OA 


Detroit Advanced Intelligence





XO 

110 

ion 

1. 1 
A n 

New Stanford Arithmetic 





I lU 

MiiO 

4. U 

Purdue English 





lU* 

OA 

lllO 
1 An 

1. 3 

Prediction index 






JvKJ 
1 Ai 

2. 6 
o n 

— 





wo 


£, V 


Table 3 indicates that the normal school girls are superior to normal
school boys in all tests. The superiority is negligible in the Iowa Com-
prehension Test and in the arithmetic test, but is probably significant
in the English test and quite marked in the intelligence test.

The situation is quite different from that found among the high-
school seniors. Owing to the limited sampling, interpretations must
be made cautiously, but apparently the normal school girls beginning 
their first year are slightly superior to high-school senior girls, while 
normal school boys are below the average of high-school senior boys. 

3. The relationship between size of high school and performance 
of seniors. 
Table 4. — Relative standings of small, medium, and large high schools
IOWA COMPREHENSION TEST
Measures

Small 

high

school 

Medium 

high 

school 

Yakima 

High 

School 

1 

2 

3 

4 

Mean 

i 24 


oa 

Standard deviation

! 7.2 

7 A 

A 7 

Q-i 


/ . o 

0. i 

Ark Oj 

2 

1 ^ 1*>JV 

29-26 
25^23 1 

22-18 J 

1 7-5 1 

31-27 
1 26-23 
' 22-19 

1 W— n 

33-30 
29-27 
26-22 
Ot Q 

s 

4 “1 

6 

Standard error of difference between small and medium

.88 ' 

1 u 

zi-yt 

Standard error of difference between medium and large 

Standard error of difference between small and large

X*. 1 

. u ' 

1 

\ 

.84 



— - ... 

j 

1 


DETROIT INTELLIGENCE TEST


Mean 

12Q 

irf ' 

90 A 

1 Q9 

Standard deviation 

32. g 

10/ 

71 9 


227-148 

£v. O 

Ol. £ 
907 1 Al 

2 • 


101 


3 

# — lOU 

129-1 14 

l9tr-loo 
1 79-1 9n 

lotf-iea 

s An 1 or* 

4 •. : 


lOx 1 
1 1 0.101 

10Q_1 17 

6 ‘ ■ 

97-60 

1 1 ir-lUl 
ino-71 

12V-113 
1 1 1— Afl 

Standard error of difference between small and medium 

a8 


111-05 

Standard error of difference between medium and large 

3.3 



Standard error of difference between small and large

3.9 



— 




NEW STANFORD ARITHMETIC TEST


Mean 

102 


1 AA 

Standard deviation 

12. 3 

Iw 
11.1 
1 99-1 li 

lUO 

in 9 

Q— 1 

119-112 

IU. d 
1 9 A- 1 1 A 

2 

1 1 1-107 

1 17 
1 1 7- 1 1 n 

1 A4>-1 10 
1 1^1 1 1 

3 

106- 100 

J Xu 1 iU 

mo-ioi 

114-111 
1 in. inA 

4 

99-92 

in.7-QA 

1 lU* lUO 
inA—QQ 

6 

91-65 

1 lAI 

oa-71 

1UO-W1 

07-AQ 

Standard error of difference between small and medium

1.4 

W/- t 1 

¥/— OV 

Standard error of difference between medium and large 

1.1 



Standard error of difference between small and large

1.4 



— — 





PURDUE ENGLISH TEST


• 

Mean 

93 

07 

ina 

Standard deviation. 

18. 8 

91 

17 A 

lUO 
1A A 


128-107 

1 1 . o 
149-1 14 

lO. 0 

2 

lf¥V-07 

1 17 
1 1 T-IAA 

if <r^l AF 
1 IQ— ino 

3 . 

96-80 

i Xij-lUO 
104- QA 

llV-llfV 
inft ini 

4 

88-74 

XXF7— ¥u 
04-AA 

llAr-lUl 

6 

'78-M 

wrrV} 

ft4— 41 

1UU-W4 

07—47 

Standard error of difference between small and medium 

22 

07 IX 

WJ— 44 

Standard error of difference between medium and large

1.8 

I 

Standard error of difference between small and large 

2.2 

1 


1 


PREDICTION INDEXES 


Mean. * i 

1 96 

101 

lOfi 

Standard deviation 

19.8 

146-113 

XUl 
10 9 

IW 
10 7 

Q— 1 

XV. 4 

1IUV1 Ifl 

la s 

2 

112-101 

X lO 

1 17-11)0 

100—140 
194—1 19 

3 .• 

lt)0-93 

92-77 

X X / — lUV 

ing-QA 

149—1 14 
1 1 t_iOO 

4 

07-flA 

X 1 I— IU4 
101—07 

6 

76-64 

vt oo 
RS-A7 

1UI~1AS 

09-A1 

Standard error of difference between small and medium

2.3 

OirNM 

)T4-01 

Standard error of difference between medium and large

ZO 



Standard error of difference between small and large.. . 

Z4 

1 




1 
Nine high schools were included in the small-school group. The
average number of seniors per school in this group was 13. The range
was from 10 to 17 seniors.

Four schools were classified as medium-sized. The average number
of seniors was 44, with a range of 29 to 54.

One school fell in the large school category with 170 seniors taking
the tests.

It will be noted that there is a direct relationship between size of 
score in each test and the size of the schools when classified in the 
manner referred to above. There are individual exceptions to this, 
however. The school, the seniors of which rank first in average score 
in each test and in the prediction index, is in the small-school group. 

4. Relationship between marks and prediction index. 

Table 5. — Correlations between high-school marks and prediction indexes

                                                          Coefficient of     Probable error
                                             Number of    correlation with   of coefficient
Marks                                        cases        prediction index   of correlation
  1                                              2               3                  4

Marks for 4 years                              339            0.367              0.032
Marks for second, third, and fourth years      343             .371               .032
Marks for third and fourth years               343             .394               .031
Marks for fourth year                          342             .379               .031
English marks                                  343             .423               .030
Mathematics marks                              333             .304               .034
Science marks                                  341             .312               .033
Foreign-language marks                         274             .397               .037
Social-science marks                           244             .315               .039
Commercial marks                               223             .155               .044
Music and drawing marks                         57             .142               .089
Home economics and industrial arts             155             .105               .052

The differences between correlations with the prediction index, of 
4 years of high-school marks, the last 3 years, the last 2 years, and 
the last year are so slight as to be negligible. In other words the last 
2 years of high-school work correlate at least as well with the battery 
of tests used as do the marks for all 4 years. As the composite of 
the tests has a proved prediction value for college marks, it would
seem that much clerical labor might be saved if the higher institutions 
requested high-school principals to draw up transcripts of marks for 
the last 2 years only. Of course students would have to be certified 
as having met high-school graduation requirements, and as having 
taken certain prerequisites for entrance to specific courses. 
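
The probable errors in Table 5 can be related to the coefficients and the numbers of cases. The sketch below assumes the conventional formula of the period, P.E. = 0.6745 (1 − r²) / √N; the text does not say which formula was used, but this one reproduces the tabled values for the rows checked here. The function name is illustrative.

```python
import math

def probable_error(r, n):
    """Probable error of a correlation coefficient r based on n cases."""
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)

# (label, number of cases, coefficient) taken from Table 5.
rows = [
    ("Marks for third and fourth years", 343, 0.394),
    ("English marks", 343, 0.423),
    ("Commercial marks", 223, 0.155),
]

for label, n, r in rows:
    pe = probable_error(r, n)
    # A coefficient several times its probable error is usually taken as established.
    print(label, "P.E. =", round(pe, 3), " r / P.E. =", round(r / pe, 1))
```

On this reckoning the coefficients for the several combinations of years differ from one another by little more than one probable error, which agrees with the statement that the differences are negligible.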





It is interesting to note that marks in English and foreign language 
correlate significantly higher with the prediction index than do those 
for any of the other subjects. In fact, either one of them is at least 
as predictive as the composite of all marks for any number of years. 
Marks in commercial subjects, music and drawing, home economics, 
and industrial arts correlate so poorly as to be practically valueless for
general prediction purposes. 


Table 6. — Correlations of normal-school marks, fall quarter, 1929, with test scores,
prediction indexes, and high-school marks



Purdue 

English 

New ' 
Stanford 
Arithme- 
tic 1 

' -Detroit 
Advanced 
, Intelli^ 

1 gence 

Jowa 
Compre- 
1 hension 

1 ’ ; 

Predic-
tion
index

I 

High-
school
marks

1 

3 

3 

1 

! 4 

1 

5 i 

1 

7 

Normal-school marks 

of 55 
.02 

1 

0.47 

.04 

1 

0.58 

ca. 

i 

0.61 1 

0.79 j 

0.«l 

.05 

Probable error



.1X5 1 

1 

. uJ 


Correlation coefficients were computed between the test scores of
normal-school students, fall quarter 1929, and their scholastic marks
for the ensuing quarter. Also a coefficient of correlation was com-
puted between mean marks and the composite or prediction index.
These are presented in Table 6. A correlation coefficient is also in-
cluded between normal-school marks and four years of high-school
marks. It is to be noted that the Iowa Comprehension or reading test
shows the closest relation with academic standing in the normal school,
and that the arithmetic test shows the lowest agreement. The predic-
tion index gives a significantly higher correspondence than any one of
the tests.

It is surprising that the high-school marks do not show a higher
correspondence with normal school marks than they do. The com-
posite of four years of high-school marks shows a definitely smaller
correlation with normal school marks than does any one of the tests
and a very much smaller agreement than does the prediction index.
This may be partially accounted for by the wide variations found in
the marking systems and the difficulty in equating them satisfactorily.
Also, quite different standards of severity of marking obtain in different
schools and for different teachers in the same school.

5. Acceleration and retardation and the prediction index.

Table 7. — Relation of varying degrees of acceleration and retardation with test
scores and prediction indexes


ELEMENTARY SCHOOL 




Mean 


Degree of acceleration or retardation

Iowa 

Detroit 

Btanfonl 

1 

Purdue 

English 

1 

Predic- 


1 Compre- 

Intelli' 

Arithme- 

tion 


henaioD 

1 

geoce 

tic 

indexes 

• 

1 

2 

3 

4 

5 

1 

6 

1 

Accelerated 3 years

1 ^ 

170 

111 

103 

126 

Accelerated 2 years


142 

104 

103 

106 

Accelerated 1 year 

1 29 

143 

109 

107 

112 

Normal rate

; 26 

128 

104 

og 

101 

Retarded 1 year 

! 22 

111 

00 

81 

87 

Retarded 2 years or more 

1 

126 

00 

84 

80 


HIGH SCHOOL 


Accelerated 1 year

Accelerated ½ year

Normal rate

Retarded ½ year

Retarded 1 year or more


30 

140 

113 

115 

104 

26 

126 

106 

07 

100 

26 

131 

106 

100 


26 

M96 

06 

06 

^ 100 

23 

117 

103 

01 

04 


Those pupils who were accelerated three years in the elementary
school show a marked superiority in all tests and in the prediction
index. The ones accelerated one or two years are superior in the
prediction index but not in all of the tests.

The pupils who were retarded one or two years in elementary school
are significantly below those who have progressed regularly, in all test
scores and in the prediction index. Apparently there is little difference
between those retarded one year and those retarded two years. The
number of cases in each category is too small to give results of much
worth.

The data on high-school irregularity of progress may be interpreted
in a somewhat similar fashion.

6. Rural and town schooling compared. 


Table 8. — Relation of mean scores of varying proportions of rural and town school-
ing to test scores and prediction indexes



Number 
of cases

1 

Iowa 

Compre- 
^ bansioD 

1 

Detroit 

Advanced 

Intelli- 

ganoe 

Stanford 

Arithme- 

tlo 

Purdue 

English 

Predic- 

tion 

index 

1 

1 

8 

4 

5 

6 

7 

All rural 

66 

24 

124 

104 

11.3 

03 

17.8 

07 

20.1 

07 

101 

101 

00 

103 

106 

18.8 

IS 

Standard deviation all rural

6.8 

32.4 

rural, J4 town 

8 

24 

118 

106 

06 

H rural, M town 

20 

26 

126 

106 

06 

K rural, M town 

18 

26 

131 

106 

101 

100 

100 

103 

16.7 

13 

U rural* H town 

33 

26 

123 

100 
107 
106 
lOi 1 

V? ruraL H town 


36 

U1 

136 

28l6 

All town 

•17# 

27 

7.2 

Standard deviation all town

Standard error of difference between all
rural and all town


LO 

4.6 

L6 
A rural school is arbitrarily defined as a 1 or 2 room school. A
school having three or more rooms is defined as a town school. The
results shown in Table 8 indicate that those receiving all or most of
their elementary schooling in town schools test higher than those
whose schooling has been entirely or mostly rural. The differences
found between the intermediate proportions are inconclusive. These
findings, of course, are not necessarily indicative of the superiority of
town elementary schools. The native talent of children living in
towns may be a factor.

7. The prediction index and occupation.

Table 9. — Relation between occupations of fathers and prediction indexes of high-
school seniors



Occupations of fathers 


j sioitf 1 

■ Clerical | 

Salesmen and skilled Farmers : Laboren 
labor 1 

1 ' I , 

' i ^ ! 3 

1 • - - 

4 j 5 6 ' 7 

1 

Per cent of total group . . | 7 

Mean prediction index of children 1 J 12 110 ' 

si ifi : 53 ! 7 

us 1 . 103 j W 1 100 


In the above table the percentage of farmers is much greater than
for the population of the entire State. This is to be expected, of course,
in an agricultural valley such as comprises the inhabited portion of
Yakima County.

The high-school seniors whose fathers are in the professions, in
business, or who are salesmen make significantly higher scores than
those whose fathers are clerical workers, skilled artisans, or farmers.
The average performance of those whose fathers are unskilled laborers
is distorted by the very high scores of two individuals. The total
number of laborers’ children is so small that these two cases have an
unusual weight. There is much overlapping between the occupational
groups. Therefore, occupation of father seems to be an unsound
basis for the prediction of the scholastic performance of the offspring.

Table 10.--Occupational intentions of seniors compared with their
prediction indexes

Occupational choice of student: (2) Professional; (3) Business; (4) Salesman;
(5) Clerical and skilled labor; (6) Farmer; (7) Laborer

Per cent of group

46 

109 

1 

7 

1 

1 

1 1 1 


7 ! 
07, 


Mean prediction index

102 

40 

0 


111 

w 

Almost half of the group expresses an intention of entering a
profession. It is to be feared that some of them will be disappointed,
as this is several times the percentage of people actually engaged in
the professions. Only 7 per cent of the high-school seniors desire to
be farmers. It seems regrettable that such a small percentage wishes
to engage in the principal activity of Yakima County.

The number electing to become salesmen is so small, only 1 per
cent, that it can not be accepted as a fair sampling as far as the
prediction index is concerned. Omitting this group from consideration,
the prediction indexes vary consistently in the same general manner as
do the prediction indexes of the offspring of those included in the
occupational groupings in Table 9. In brief, there is a small but
probably significant relationship between the prediction indexes and the
occupational intentions of the high-school seniors of Yakima County.

8. Relationship between educational intentions and the prediction
index.

Table 11.--Test scores and prediction indexes in relation to intentions of
continuing formal education

                                    Per cent    Mean scores
                                    of total    ----------------------------------------------------------
                                                Iowa        Detroit     Stanford    Purdue      Prediction
                                                Compre-     Advanced    Arithmetic  English     index
                                                hension     Intelli-
                                                            gence
                 1                  2           3           4           5           6           7

Students continuing with education  91          28          129         106         102         106
Students not going on to school     9           25          125         103         94          99


The average scores of students who plan on continuing their formal
education beyond the high school are slightly higher in all of the tests
and in the prediction index. The smallness of the superiority of those
intending to attend college would indicate that aptitude for higher
education is a small factor in influencing the decision of the pupil or
of his parents. An untabulated analysis of the data shows that a
large percentage of low-test-score people intend entering higher
educational institutions. Many of these people will be eliminated at
considerable expense to the State, and, perhaps more important, with
attendant personal humiliation and sense of failure. There would
seem to be a vital need for a program of guidance involving both the
high schools and the higher institutions.

Table 12.--Relationship between different types of higher educational
intentions and test performance

Mean scores


Per cent 


I 2 


Liberal arts ]s 

administration ; 

Teaching (preparation other than normal school) 7

Engineering, science, etc ir,' 

Professional, other than teaching

Teaching (normal school preparation) 7 

Business college 20 

Vndecided as to course ’ r, 

College of agriculture 4 


Standard error of difference between liberal arts and normal school

Standard error of difference between normal school and agricultural college


Iowa Comprehension

Detroit Advanced Intelligence

Stanford Arithmetic

Purdue English

Prediction index

3 

■ j 

4 

r» , 

6 

7 


147 

104 

- Ill 

113 

2s 

MO 

no 

IMi 

110 

2V 

, 1.30 

m 

104 

The above table should require very little interpretation. The
people planning on attending a normal school made scores which, on
the average, approximated those of the entire group of high-school
seniors, the prediction index being exactly the same.


V. INTERPRETATION AND SUMMARY

1. Mean scores of 227 first-quarter normal-school students are
recorded in Table 1 (p. 53). In Table 6 (p. 59) correlations are
given between the scores of 212 of these students and their
first-quarter normal-school marks. When the letter marks are translated
into an arbitrary numerical scale ranging from E-0 to A-10, the
average is slightly in excess of 5 or C plus. The average marks of
the respective prediction-index quintiles or fifths are given in Table 13.

Standard error of difference between means of normal-school marks of Q-1 and Q-3 = 0.37.
Standard error of difference between means of normal-school marks of Q-3 and Q-5 = 0.30.
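
The bulletin does not state how these standard errors of difference were obtained. As a hedged illustration only, the sketch below uses the common textbook formula for two independent means, the square root of the sum of the squared standard errors of each mean. The function names and all figures are hypothetical and are not taken from Table 13.

```python
import math


def standard_error_of_difference(sd_a, n_a, sd_b, n_b):
    """Standard error of the difference between two independent means.

    Assumes the usual textbook formula sqrt(sd_a^2/n_a + sd_b^2/n_b);
    the source does not say which formula its author used.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("group sizes must be positive")
    return math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)


def critical_ratio(mean_a, mean_b, se_diff):
    """Difference between two means expressed in standard-error units."""
    return (mean_a - mean_b) / se_diff


# Hypothetical quintile figures, NOT the values behind Table 13.
se_diff = standard_error_of_difference(sd_a=2.0, n_a=40, sd_b=1.5, n_b=45)
ratio = critical_ratio(6.1, 5.0, se_diff)
print(round(se_diff, 2), round(ratio, 1))
```

A difference several times as large as its standard error is unlikely to be due to sampling alone, which is the sense in which differences are called significant in this kind of comparison.
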

One-half of those in the lowest prediction-index quintile do work
averaging D or lower. D is passing but not satisfactory, and not
more than one-fourth of a student’s credits may be D.

Thirty-one people, or about 15 per cent of the group, earned a
prediction index of 80 or below. The mean mark for the first quarter
of 1929 for this group was 2.3, a trifle above D upon the
above-mentioned scale. Only one person in this group received marks as
high as the average for all first-year students. Twenty-six of the 31
received marks which were definitely unsatisfactory. These data,
fortified by data from previous years, would indicate that students with
prediction indexes of less than 80 are very unlikely to do satisfactory
normal-school work.

On the other hand, no student in the upper fifth of the
prediction-index distribution did definitely unsatisfactory work. Seven
of the 43 were below the average for the student body, but none of these
was below an average of C.

The data presented in Table 13 and in the above statements 
should indicate, with a fair degree of reliability, the type of academic 
work to be expected from high-school graduates of varying predic- 
tion-index levels. This is particularly true of the upper and lower 
ranges. 

2. Summary of results.--A brief summary of the results recorded
in the tables and in the accompanying interpretative data follows.
These conclusions are based upon the findings from 458 Yakima
County high-school seniors and 227 first-year normal-school students.
They are valid only in so far as these two groups are representative.

(a) Comparison of first-year normal-school students with high-school
seniors.--(1) The average test performance of first-year normal-school
students is about the same as that of Yakima County high-school
seniors. (2) The highest high-school senior scores are above the
highest normal-school scores. (3) The lowest scores are about the
same for the two groups. (4) The normal school received a slightly
higher proportion of the low-prediction people than of the very high
prediction students. (5) Normal-school students and high-school
seniors show similar wide variation in reading ability and narrow
spread in arithmetic.

(b) Sex differences.--(1) Sex differences in test performances are
small or negligible for high-school seniors. (2) Normal-school girls are
slightly ahead of normal-school boys in two tests, and significantly
ahead in the remaining two tests. The girls have an advantage
of 9 points in prediction index, a difference which is statistically
significant.

(c) Size of school as a factor.--There seems to be a tendency for
the performance of high-school seniors to be somewhat in proportion
to the size of the school. While this is true for the averages of the
different size classifications, there are individual exceptions.

(d) Reliability of tests in prediction.--(1) The high-school marks
for the last two years, or for the last year, correlate as well, if not
better, with the prediction index than do the marks for three or four
years. (2) English and foreign language marks correlate better
with the prediction index than all of the marks for any number of years.
They also show significantly better correlations than any other subject
groupings. (3) Commercial marks, marks in music and drawing, and
those in home economics and industrial arts show negligible correlation
with the prediction index. (4) Any one of the tests used has shown
better correlation with normal-school marks than the average of four
years of high-school marks. (5) The prediction index has a very much
higher prediction value than any one of the tests or high-school marks.

(e) Acceleration and retardation.--(1) There is a definite tendency
for those high-school seniors who have been accelerated in either
elementary school or high school to obtain higher prediction indexes
than the average. (2) There is a tendency for those seniors who
have been retarded in either elementary school or in high school to
earn lower prediction indexes than the average.

(f) Rural-urban factor in elementary schooling.--There is a slight
tendency for those having all or most of their elementary schooling
in town schools to receive higher prediction indexes than those having
all or most of their elementary schooling in rural schools.

(g) Occupation of parents.--There is a tendency for the children
of professional men and those engaging in business to obtain higher
prediction indexes than the children of artisans or farmers.

(h) Vocational plans.--(1) There is a tendency for high-school
seniors who plan on entering a profession or business to have higher
prediction indices than those who expect to engage in clerical work,
skilled labor, or farming. (2) The percentage of seniors who are
anticipating a white-collar job is probably much greater than can be
accommodated. (3) Apparently farming needs to be made more
attractive to high-school graduates. (4) More than 90 per cent of
the Yakima County high-school seniors expect to continue with formal
education. (5) There is a slight but probably significant relation
between the intention of going on to college and aptitude for college
work. (6) A considerable number of seniors expect to expose
themselves to higher academic training who could probably profit more
from some other type of training. (7) Table 12 (p. 63) shows the
average scores made by those expecting to enter different types of
higher education.

In interpreting the above correlations and apparent tendencies,
we must be cautious in assigning causal relations. The fact of
correlation does not prove causal sequence. The many factors involved are
probably interlinked with each other in a complex interdependent
fashion. Any factor may be as logically considered a dependent
variable as an independent variable.
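
The caution above can be made concrete with a small sketch. It assumes the correlations in question are ordinary product-moment (Pearson) coefficients, which the text does not state, and it uses invented figures. The coefficient is the same whichever variable is treated as the dependent one, so the number by itself cannot say which way any causal relation runs.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical prediction indexes and marks, NOT data from the survey.
indexes = [80, 95, 100, 110, 120]
marks = [2.0, 4.0, 5.0, 6.0, 8.0]

r_forward = pearson_r(indexes, marks)   # marks treated as the outcome
r_reverse = pearson_r(marks, indexes)   # indexes treated as the outcome
print(round(r_forward, 3), round(r_reverse, 3))
```

Both calls print the same value, which is the symmetry the paragraph above relies on.
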

In conclusion, the writer realizes the limitations of this study. It
is hoped that the survey may be charitably interpreted as an attempt
to illuminate, however faintly, some of the many specific problems of
the Ellensburg Normal School. Quite different results might have
appeared if the seniors of large city high schools had been included.
Such a survey carried on in other States might yield very different
returns. The usefulness of the battery of tests employed would
almost surely vary for other types of higher institutions. It is not
assumed that any of the findings should be generalized without much
more corroboration than is available.


REMEDIAL READING INSTRUCTION AS A PHASE OF 
PERSONNEL WORK IN HIGHER EDUCATION 

Frank W. Parr * 

I. IMPORTANCE OF THE PROBLEM 

The purpose of this paper is to give a brief discussion of remedial
reading instruction as a phase of personnel work in higher education.
It is needless to say that one can not do justice to such a comprehensive
and important topic as this in the short time allotted. However, I
shall attempt as best I can to suggest in this paper the need for, and
the nature and extent of, a procedure that might be used, and the
effects of a program in remedial reading on the college level.

That poor reading ability is a distinct handicap to college students
has been pointed out by such authorities as Morrison, Book, the
Presseys, Remmers, Lemon, and others. In discussing the diagnosis
of pupil difficulties, Morrison says that “cases are occasionally found
in which pupils progress incredibly with very slender reading ability --
a very considerable number of pupils find their way into high school
and even into the college without the reading adaptation. They can
get the meaning of the printed page, but they do so laboriously by a
process of deciphering. In effect they are usually slow students, and
when they reach the subjects which require assimilation by extensive
reading they become problem cases. They can not study effectively
subjects which require extensive reading because they can not reflect
upon the meaning as they read” (2). Lemon found that practically
every member of his group of problem cases at the University of Iowa
had a marked deficiency in reading, as did Remmers with his group
at Purdue University. Book, working with freshmen at Indiana
University who were unsuccessful in their university work, said that
he found that these students were very deficient in their ability to
read and had to be given special help in learning to read more
effectively before they could succeed with their academic work. He says
that his experience with these students “clearly showed that the
difficulties which they were encountering were chiefly due to their
inability to read, and to wrong methods of work” (1). The writer
a few years ago made a study of the poor readers among the freshmen
who entered the University of Iowa in the fall of that year. He
found that 63 per cent of the 350 poor readers received scholastic
delinquency reports at the midsemester of their first semester’s work,
with an average of 5.7 hours work delinquent per student. Forty-nine
per cent of the grades received by this group at the end of the
first semester were below C. By the beginning of the second semester
110, or 32 per cent of this group, had been eliminated from college.

* F. W. Parr, professor of secondary education, Oregon State College. B. S., University of Illinois, 1W6;
M. A., University of Iowa, 1976. Ph. D., 1929. Publications: With E. R. Isvik, “Handwriting in the High
School,” School Review, 35:776-779, December, 1927; “A Remedial Program for the Inefficient Silent Reader
in College,” Phi Delta Kappan, 12:68, August, 1929; with C. L. Nemiek, “What Becomes of the Inefficient
Silent Reader in College,” Peabody Journal of Education, 7:299-303, March, 1930; “The Extent of Remedial
Work in State Universities in the United States,” School and Society, 31:647-648, April, 1930; “How Do
College Students Prepare an Assignment,” School and Society, 31:712-713, May, 1930; “Teaching College
Students How to Read,” Journal of Higher Education, 2:326-331, June, 1931.

The importance of the situation is well stated by Schultz and
Miller, who report a reading investigation which was carried on at
Christian College at Columbia, Mo. “The enrollment of first-year
college students suffers approximately a 40 per cent mortality during
each year. Some educators attribute the failure of many of these
students to poor reading ability. All of the studies which have had
to do with reading at the college level reveal the fact that students
show a tremendous variation in ability to read. In extreme cases
certain individuals have been found to read nineteen times as
effectively as others in the same class. Since a college education must
come largely through the medium of reading, any scheme which will
improve reading with a reasonable expenditure of time and effort
can certainly be justified on the basis of aggregate benefits which will
accrue throughout a 4-year college course” (4).

II. THE EXTENT AND NATURE OF REMEDIAL READING INSTRUCTION 
IN COLLEGES THROUGHOUT THE UNITED STATES 

Anyone interested in the remedial phase of reading would be
impressed in examining our educational journals with the number of
studies reported which have been carried on in the elementary school.
He would be equally impressed with the paucity of reports pertaining
to the upper levels of the school system. Two possible explanations
might be advanced to throw light on this situation. In the first
place, it may be that educators assume that students of college age
have an adequate mastery of the reading process, and therefore need
no further training along this line. A number of experiments have
been carried on to prove that this assumption is quite erroneous. A
more plausible explanation for the paucity of material on the college
level is that those men who have been carrying on such work have been
negligent in reporting it. That is, it is reasonable to assume that only
a small per cent of the studies in any field of endeavor is reported in
our educational journals.

In order to get more complete information on the extent and nature
of remedial reading instruction in this country the writer two years
ago sent a letter to every State university in the United States.
This letter, which was in the form of a questionnaire, was addressed to
the dean of the college of education at each institution. The following
is a summary of the data received from the 40 schools that returned
the questionnaire. (1) Only 9 schools made any attempt to discover
the poor readers among their freshmen. (2) Only 7 of these schools
had a plan for assisting the poor readers. (3) When remedial work
was offered it was usually under the supervision of the college of
education, although the psychology department assisted in the work
at 3 of the schools. (4) The remedial instruction when offered is a
phase of a “How to Study” course. Seven of the 9 schools offered it
in this manner. (5) Only 4 schools made the remedial work compulsory
for those in the freshman class who were in need of such instruction.
(6) Only 4 schools gave college credit for the remedial instruction.
(7) There was no standard practice as to the length of time
devoted to the remedial work or to the frequency of class meetings.
The range in the length of time given was from 2 weeks to 36 weeks,
and the number of meetings held ranged from one every two weeks to
3 meetings per week. (8) Five schools reported that they used a
syllabus or workbook in connection with the remedial work. (9) In
reply to the question, “Do you have any evidence that this work
improves the reading ability of the students?” only 5 schools
replied in the affirmative. Five schools also claimed that they had
evidence to show that the students did better college work in general
as a result of their improved reading ability. (10) A number of the
schools described briefly the nature of the remedial program. Some
of these descriptions were: “Course in psychology of reading,” “Only
locate the trouble,” “Mostly throat relaxation and increased eye
span,” “Merely diagnose reading comprehension of freshman -- no
remedial treatment.”

That a great deal of interest is being manifested in this problem of
remedial training in reading on the college level is indicated by the
fact that deans of 16 schools where no remedial program was provided
made comments on their questionnaires expressing keen interest in,
and approval of, such work.

III. SUGGESTED PROCEDURE FOR CARRYING ON A REMEDIAL
READING PROGRAM ON THE COLLEGE LEVEL

In planning a remedial program for the poor reader in college one
should follow some well-defined procedure, which would probably
incorporate the following steps: (1) A complete case history of each
student. This should give information concerning the student’s
school history, physical record, reading habits, emotional
characteristics, and study habits. (2) An adequate diagnosis of disabilities
of each student. This will necessitate the use of two principal
media, viz, observation and testing. Through observation we must
note the frequency and nature of the eye-movements, vocalization or
lip reading, finger pointing, visual defects, etc. By means of tests we
may get information concerning the general mental ability of the
student, which is a very essential type of information for any remedial
program, his general reading ability, and his specific reading abilities.
A good diagnostic test should be used which will point out specifically
the deficiencies in the various skills in reading (e. g., Iowa Silent Reading
Test, University of Minnesota Reading Test). (3) A remedial program
based upon the analysis of deficiencies should be set up for each
student. Remedial drill exercises should be provided to care for each
deficiency. It is needless to say that the remedial training for each
student should be designed to fit his needs, and since no two students
present exactly the same combination of deficiencies, the remedial
instruction should be given individually if at all possible. It is well
to test each student at least once during the course of the remedial
instruction, and again at the end of the program so that both the
student and the teacher may know just how much progress has been
made. The more highly the student is motivated during the remedial
program, other things being equal, the better will the results be.

IV. THE EFFECTS OF THE REMEDIAL PROGRAM

Published reports of studies which have been carried on in this 
field seem to indicate that remedial instruction in reading on the 
college level may be extremely beneficial to the student. Reference 
will be made at this time to a few of the outstanding programs that 
have come to the writer’s attention. 

Probably the most comprehensive program to be reported in remedial
reading instruction is that which is under the direction of Drs.
L. C. and S. L. Pressey at Ohio State University. A reading test
is given to all freshmen entering that school, and all of the poor
readers are required to attend a remedial class until their reading
deficiencies are removed. Since this work has been carried on for a
number of years the Presseys now have some evidence to show the
effect of the instruction. In a controlled experiment, a group of 422
poor readers at Ohio State University was paired with a similar group
which was not given remedial instruction. The experimental group
far excelled the control group both in improvement in reading and in
scholarship. In commenting on her study, Mrs. Pressey says, “It
seems quite evident from this investigation that it is possible to train
students to read effectively and that such training is more likely than
not to transfer to the preparation of lessons and to general
understanding of college work” (3).

Book, working with 54 students at Indiana University, reports
that these students increased their reading efficiency on the average
102 per cent. Some of his group showed improvements as high as 250
per cent. The ability of the group to master a standardized assignment
had likewise increased from 60 to 97 per cent.

Remmers and Stalnaker at Purdue University gave remedial training
to 7 freshmen students who scored in the lowest quartile on the
American Council Psychological Examination. The results show an
average gain of 24.6 per cent in rate of reading and a similar gain in
comprehension.
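
The reports cited do not say how their percentage gains were computed. The sketch below shows one common reading, gain relative to the initial score, and contrasts it with a simple change in percentage points. The function names and the words-per-minute figures are hypothetical; only the 60 and 97 per cent mastery figures come from the text above.

```python
def percent_gain(before, after):
    """Gain expressed as a percentage of the initial score."""
    if before <= 0:
        raise ValueError("the initial score must be positive")
    return (after - before) / before * 100.0


def percentage_point_change(before_pct, after_pct):
    """Plain difference between two percentages."""
    return after_pct - before_pct


# Hypothetical reading rates in words per minute: roughly a 102 per cent gain.
rate_gain = percent_gain(200, 404)

# Mastery of a standardized assignment rising from 60 to 97 per cent:
points = percentage_point_change(60, 97)   # change in percentage points
relative = percent_gain(60, 97)            # gain relative to the starting level
print(round(rate_gain, 1), points, round(relative, 1))
```

The two measures can differ widely for the same pair of scores, so gains reported by different investigators are comparable only when computed in the same way.
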


In an unpublished study Schultz and Miller describe an experiment
carried on at Christian College, Columbia, Mo. Gains in reading
comprehension for a group of 27 poor readers ranged from 0 to 114 per cent.

In the June, 1931, issue of the Journal of Higher Education, the
writer describes an experiment which he conducted with 20 students
at the University of Iowa, and shows the effects of a well-organized
program of remedial instruction in reading. Gains in both comprehension
and rate of reading were made by each of the 20 students,
some of the gains being well over the 100 per cent mark. Not only
did these students improve materially in reading ability, but they also
showed gains in scholarship records. With the exception of four
students, the members of the remedial class made gains in scholarship
average. During the semester in which the remedial instruction was
given, nine, or 45 per cent, of these students earned their highest grade
point averages for any single semester in college. A follow-up study
of the 16 students who were enrolled in the university the year following
the remedial instruction showed that these students continued
to improve in scholarship. Eighty-two per cent of the group earned
scholarship averages which equalled or excelled those for the previous
year. Six of the nine students who made record averages during the
period of the remedial instruction earned even higher averages for the
following year. Three other members of the remedial class also made
record averages for the year following the reading instruction.

While the experiment just cited was carried on with upperclassmen,
it is reasonable to assume that comparable results would have been
obtained with an underclass group.

It is interesting to note that most of the investigators who have
worked in this field agree that students profit by this remedial
instruction in proportion to their mental ability. It probably does not
pay to spend time and money on the subnormal group, for the
improvement does not justify the expenditure.

In conclusion, the writer has tried to show (1) that there is a need
for providing some type of remedial instruction for the poor readers
who come, and probably will continue to come, to our colleges; (2)
that there are but few schools throughout the country that give any
attention at all to their poor readers; (3) that one must have a
well-planned program for carrying on such remedial instruction; and (4)
that the results obtained from studies of remedial reading seem to
warrant the attention which is being given to this important phase of
educational work.


THE PREDICTION OF SUCCESS IN ENGLISH COMPOSITION

L. Kenneth Shumaker *

The prediction of success in English composition is a problem which
has long been a serious concern of teachers of English. The multitude
of factors which enter into the problem and the varying importance
attached to these factors by independent workers in the field have
served only to create confusion. It is our purpose to show to what
extent certain research carried on in the English bureau of the
University of Oregon has served to define and limit the different factors
of the problem and to solve some of its difficulties.

Most composition teachers will agree that they are interested in
teaching students to write organic English. They begin to differ
as soon as various contents of courses, methods of instruction,
and tests for measuring the results of instruction are considered.
Let any group of English teachers attempt as innocent a
task as setting up “minimum essentials” and the troubles become
instantly apparent.

In order to accomplish anything, it becomes imperative that terms
and objectives be defined with greatest clarity. The universe of
discourse produced thus arbitrarily gives a known point of departure
from which measurable progress may be determined and calibrated.
In the English bureau, our objective was stated in this manner: Let
us demand that no student be admitted to any college class in English
composition until he can write an organic sentence. An organic
sentence means a group of related, arbitrary symbols (words) which
communicate a complete thought. The presupposition is allowed
that thinking takes place within the mind of any given individual by
means of the proper combination of ideas; and that the correlation
between the effectiveness of the thinking process and the representation
of it by means of words as arbitrary, common symbols of knowledge,
accepted by two or more individuals, differs among individuals
in unknown ratio. The tendency is that the clearest thinking is
easiest to express in words, and that the clearest thinker tends to
have at his command the largest stock of words in which to represent
his ideas. There is no reason to presuppose that an individual who
is inarticulate in words may not be preeminently articulate in music,
painting, sculpture, or even mathematical symbols. It is also taken
for granted that not only must a certain word be the symbol for one
idea in a given context, but that the arrangement of the words must
follow a definite pattern as thoroughly agreed upon as the definition
of the word itself within the pattern.

There are many ways in which to acquire the knowledge of words
and word patterns, but the most commonly accepted methods are (a)
through a study of rules (formal grammar) which state concisely
observed behavior of word patterns (sentence structure); or (b) through
habitual reading, hearing, writing, and speaking in which symbols and
patterns become habitual in the individual without his awareness of
any names or descriptions of these habits. The teacher in elementary
and secondary schools attempts to employ the first of these methods
in the recognized “language” courses of the public schools. The
everyday contacts of the individual from the time he first learns to
say “papa” and “mamma” tend to fix habits which may not accord
in any way with the formal rules taught in the classroom. There is
every reason to recognize that the second method has the deepest
and most lasting impression upon the mind of the individual because
(a) to appreciate the significance of formal rules requires a certain
ability to think abstractly, and (b) to acquire the habits of companions
with whom an individual is continually thrown is easier than to learn
to obey the arbitrary dicta laid down in 1 short hour of the 24, two,
three, or five times a week during three-fourths of the year.

With these presuppositions in mind, therefore, it seems more
logical to attempt to determine, first, the student’s language sense,
or general aptitude for language; and, second, to diagnose his specific
difficulties with language before attempting to instruct him in college
classes in English. Expediency also renders highly desirable the
employment of tests which are as objective as possible for measuring
aptitude and for diagnosing difficulties. There is a well-known tendency
on the part of humanly frail readers to take into account too
many connotations implied in the contexts of manuscripts in attempting
to evaluate the pure denotations evidently expressed. To put it
another way, there is a tendency to give the benefit of the doubt to
any manuscript which contains cleverness, humor, or penetration,
even though a comma may be misplaced or a modifier may be out
of its accepted bounds. Such an error may be due to the student’s
inadvertence. Since the composition teacher is primarily interested
in the excellence of the mechanics of language, particularly in the
elementary courses, there is also the necessity of giving high value to
the manuscript which is free from language errors, but which contains
a slight modicum of cogence.

Reflection upon the material thus roughly indicated determined
the steps which led to the construction of the objective aptitude test
now in use by the English bureau. This test consists of four parts,
each of which contains 100 possibilities of success, and each of which
is intended to measure a single, indispensable phase of language
aptitude.

Part I is devoted to finding out whether or not the student can put
the right word in the right place and whether or not he can spell that
word correctly, if he does know it. It is quite true that there is no
correlation recognized between a pure ability-to-spell and language
sense, yet it is most desirable that words not only have but one definition
in a specific context, but that they have a standardized orthography.
To the extent that the student is able to determine the right
word for the right place in this part of the test, we are measuring pure
language aptitude; and to the extent that the student spells the word
correctly, we diagnose his ability to spell.

The instructions for Part I are as follows:

Fill each blank with a word according to the sense suggested. Make each
word fit as accurately as possible.

Example: Thanksgiving comes the last Thursday in ________ (Name of month)

It will be observed that the stress is laid upon the aptitude aspect 
of this test and that there are no elements of confusion offered in the 
diagnostic aspect of the test. Most written spelling tests either spell 
the word correctly and incorrectly and ask the student to make a 
choice of the correct spelling, spell some words correctly and others 
incorrectly and ask the student to write the correct spelling of any 
incorrectly spelled words in a convenient blank, or ask the student
to supply a few troublesome letters in words which are almost entirely
spelled out in contexts or in columns. In any case, the tendency of
all these tests is to call especial attention to difficulties and to give
the least chance for common memory habits to assert themselves. 

The words in Part I are arranged in order of difficulty. Those at
the first of the test are implied most obviously and are most easy to
spell. The very fact that the first part of the test is extremely easy
is an encouragement to the student to proceed with expectation of
success.

The predictive power of Part I is approximately double that of any
other part of the test for the purpose of determining language aptitude.

Part II attempts to measure the student’s aptitude for the use of
correct idiom. It might, perhaps, be more accurately called a usage
test. The instructions at the first of it are as follows:

In the following composition 100 words or phrases have been underlined.
Some of these are incorrect and some of them are correct. Draw a line neatly
and clearly through the incorrect words or phrases. Make no other marks upon
the paper.

Example: He ain’t going home.

The words and phrases selected have been chosen from evaluated
test material used over a period of years and are arranged as nearly
as possible in the order of difficulty. They also appear in a normal
setting as part of a context. Considerable difficulty was encountered
in building a simple, easy-flowing composition which conformed to
the necessary specifications. The composition had to be organic and
natural, and it had to contain the 60 incorrect usages and 40 correct
ones in order of difficulty as nearly as possible. The predictive power
of Part II is approximately one-half that of Part I.

Part III is called Punctuation. It is divided into three sections,
with instructions as follows:

Section 1. In the following composition 20 blanks occur. A period is the
appropriate mark of punctuation which should be placed in some (not all) of
these blanks. Place an X in the blanks where you think a period belongs.

Example: Come to the house at 4 o'clock X we shall have tea then X

Sec. 2. In the following composition 20 blanks occur. A comma should be
placed in some of them, a semicolon should be placed in others, and some should
be left blank.

Example: The red , blue , and yellow of the sky blended beautifully
; the sun seemed to be the center of a great , glowing vortex.

Sec. 3. In the following passage you will find blanks in which certain marks
of punctuation should be inserted. Sometimes several marks should be put into
the same blank. Notice carefully the different single choices which may also
occur in combinations:

1. Leave the blank open if no punctuation is required.

2. Use X to signify a period.

3. Use a comma according to the rules for the comma.

4. Use opening and closing quotation marks (“ ”) around direct quotations.

5. Use a question mark after questions.

6. Use ¶ to indicate a new paragraph.

You are not to change capitalization or make any other marks upon your
paper.

Examples: “ perhaps it is the very simplicity of the thing which puts
you at fault said my friend X “ what nonsense you do talk! ” replied the
inspector , laughing heartily X

The punctuation tested for in this part may be called “functional”
in the purest sense of that term. None of the arbitrary punctuation
marks used according to convention alone appears. A student may
learn by rote such conventions as placing the period after abbreviations,
the comma between the number of the day of the month and
the number of the year, in the same way in which he learns to spell
a word correctly, but he must have a certain aptitude for language
before he knows that a period comes at the end of a declarative sentence
or that a comma is placed after an introductory adverbial
element. The predictive power of this part of the test lies between
that of Part I and Part II.

Part III was made in sections in order to get some kind of order
of difficulty. Section 1, it is observed from the instructions just
quoted, demands the use of end punctuation only. The correct use
of the period in this section means that the student has sufficient
language aptitude to know when a complete thought has been expressed
and another thought begins. In section 2 every effort was
made to present undebatable uses of comma or semicolon. In section 3
a passage from a standard edition of one of Poe’s tales was
taken almost verbatim, because the greater our variety of choice in
punctuation becomes, the more complex the context in which it
appears, the greater is our tendency to differ with each other about
the loci of marks of organic punctuation. The English bureau wished
to be absolved from as much argument as possible.

Part IV is called Grammar. The instructions at the first of this
part are:

In the following composition 100 words or phrases have been underlined.
Some of these words and phrases are correct and some of them are incorrect.
Draw a line through those words or phrases which you believe to be incorrect.
Do not try to improve upon the composition. Make no other marks upon
the paper.

Example: There is four men in the room which adjoins the library.

It will be observed from these instructions that Part IV follows
the technique employed throughout this test: The presentation of
every item in as natural a contextual setting as possible, so that the
student will have every opportunity to make maximum use of his
language habits, whether or not he knows a single formal rule of
grammar, because it is the purpose of this test to measure the degree
of refinement of language habits, or language aptitude. This part
of the test was lowest in predictive value and was arbitrarily rated
at unity.

This completes a recapitulation of the theory upon which the
objective aptitude test has been constructed, and a description of
the physical form in which it appeared. No matter how long the
problem of predicting success in English composition is pondered in
theory, and no matter how beautiful the solution appears on paper,
the actual use of the test is the ultimate answer to the question:
Has the instrument practical value? The next step was to give the
test.

The statistical department of the university was called upon to
give its assistance at this point, and the research was placed under
the direct supervision of Ralph Leighton. The reliability of the test
was found to be surprisingly high. The only flaw seemed to be that
there was no criterion with which to compare an apparently effective
test.
Most composition teachers base their estimates of student achievement
upon their rating of themes written by the student, but the
variation in these ratings by different instructors who read the same
themes, and even by the same instructor when he reads the same
theme more than once, is notorious. Several reputable composition
scales were investigated and rejected because they did not afford
sufficient control of the factors involved in the problem of obtaining
highly reliable judgments. Research upon the evaluation of essay
type examinations indicated that a sufficient number of judgments
tended to correct the errors in each other, if these judgments were
averaged in order to get a general merit score to represent the value
of student work. We therefore devised a score, which will be explained
below, to serve as the means of gaining a reliable estimate of themes
which would be written by a group of students at as nearly as possible
the same time, and under the same conditions, as these students
would take the objective test. The evaluation of these themes would
become the criterion with which we should compare the objective test.

In evaluating each theme, the following equation was used:

$$GM = \frac{1F + 2G + 2R + 5C}{10}$$

It is explained in this manner:

GM = general merit.

1F = 1 weight given to a score of 100 points for a paper perfect in physical
form. In estimating form, only the appearance of the paper
is scored.

2G = 2 weights given to a score of 100 points for a paper perfect in all
the mechanics of idiom, grammar, and correct sentence structure.

2R = 2 weights given to a score of 100 points for a paper having the most
artistic and skillful use of word choice, sentence structure, and
rhetorical excellence.

5C = 5 weights given to a score of 100 points for a paper presenting the
best thought content and most logical organization of ideas.

The denominator 10 is the sum of the weights.
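
As a worked illustration of the weighting just explained, the following Python sketch computes a general merit score. Only the weights (1, 2, 2, 5) and the denominator 10 come from the explanation above; the function names, the sample scores, and the step of averaging over several readers are hypothetical, the last being suggested only by the earlier remark that averaged judgments tend to correct each other's errors.

```python
# Weights from the explanation above: form 1, mechanics 2, rhetoric 2, content 5.
WEIGHTS = {"F": 1, "G": 2, "R": 2, "C": 5}


def general_merit(f, g, r, c):
    """Weighted general merit for one reader's scores (each on a 0-100 scale)."""
    scores = {"F": f, "G": g, "R": r, "C": c}
    total_weight = sum(WEIGHTS.values())  # 10, the denominator
    return sum(WEIGHTS[key] * scores[key] for key in WEIGHTS) / total_weight


def averaged_general_merit(readings):
    """Average the general merit over several readers (hypothetical helper)."""
    merits = [general_merit(*reading) for reading in readings]
    return sum(merits) / len(merits)


# Hypothetical (F, G, R, C) scores from two readers of the same theme.
readers = [(90, 80, 70, 60), (100, 70, 80, 50)]
print(general_merit(*readers[0]))       # one reader's general merit
print(averaged_general_merit(readers))  # mean over both readers
```

Because content carries half of the total weight, a paper that is neat and mechanically clean but thin in thought cannot score high on this scale.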

The use of this equation resulted in producing a criterion of extremely
high reliability. The immediate explanation of this high
reliability seems to be that the last three members of the numerator
absorbed any widely differing points of view upon the different
elements judged, so that the general merit of the paper was not
unduly affected by any reader failing to include all of the possible
merits of the paper in the final score.

Late in the spring term of 1929-30 a group of 291 high-school
seniors in Eugene High School and the University High School was
given the objective test and asked to write a theme upon a subject
presented to them.

The objective test was carefully and accurately scored by trained
clerical assistants, and the criterion was scored by two Portland high-school
teachers of excellent training and experience, Miss Geraldine
Cartmell and Mrs. Katherine Dili6.

The reliability of the objective test was found to be 0.93; that of
the criterion, 0.88; the coefficient of correlation between the test
and the criterion, 0.50. This correlation was three times as great
as the correlation between the criterion and the next highest correlating
objective test among three others used in the experiment.
The aptitude test was later found to correlate 0.67 with the psychological
placement test.
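
For readers who wish to see how a coefficient of correlation such as those just quoted is obtained, the sketch below computes the ordinary product-moment coefficient in Python. It is an illustration only: the function name and the five pairs of scores are hypothetical and are not data from the study, and the report does not state which computing procedure the statistical department followed.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists of scores."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    cross = sum(a * b for a, b in zip(dev_x, dev_y))  # sum of cross products
    ss_x = sum(a * a for a in dev_x)                  # sum of squares, x
    ss_y = sum(b * b for b in dev_y)                  # sum of squares, y
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical scores for five students (not data from the study).
test_scores = [210, 250, 300, 320, 380]
theme_merit = [55, 60, 58, 72, 75]
print(round(pearson_r(test_scores, theme_merit), 2))
```

The function assumes that neither list is constant; a list with no variation would make the denominator zero.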

Although the test was statistically evaluated and pronounced
sound in technique, capable of administration, and highly reliable
and reasonably valid against a reliable criterion, through this preliminary
manipulation, the question “Has it practical value?” was
not yet adequately answered. The third step was to use the test
with the entering freshman class of the fall of 1930. Eight hundred
and fifty-eight students were grouped in percentiles upon the basis
of scores made in the test under discussion. We had the percentile
rating for each student according to his high-school record and also
according to his record on the psychological placement test. This
gave us courage to make an arbitrary division between the seventeenth
and eighteenth percentiles of the English aptitude test, because
we could use these other two ratings in addition to actual class contacts
to correct any injustices which might be done. We therefore
arbitrarily assigned 158 students who were below the eighteenth percentile
to our course in English A, designed for the correction of
faults in technical English.
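
The division described above can be sketched in Python as follows. The sketch assumes one common definition of percentile rank (the percentage of the group scoring strictly lower), which the report does not specify; the function names, the default cutoff argument, and the ten sample scores are hypothetical.

```python
def percentile_ranks(scores):
    """Percentile rank of each score: percent of the group scoring strictly lower."""
    n = len(scores)
    return [100.0 * sum(1 for other in scores if other < s) / n for s in scores]


def assign_sections(scores, cutoff=18.0):
    """Label a student 'English A' when the percentile rank falls below the cutoff."""
    return [
        "English A" if rank < cutoff else "regular"
        for rank in percentile_ranks(scores)
    ]


# Hypothetical aptitude-test scores for ten students.
scores = [12, 35, 47, 51, 58, 63, 70, 77, 84, 95]
for score, label in zip(scores, assign_sections(scores)):
    print(score, label)
```

With the cutoff placed at the eighteenth percentile, the 158 students assigned to English A are about 18 per cent of the 858 tested, which agrees with the figures given above.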

A study of the curve made by plotting the percentiles above the
seventeenth would indicate that we could conveniently divide the
group which was not assigned to English A into three parts. Those
in the highest “third” should probably be placed in accelerated
sections; those in the middle “third” should be in normal sections;
those in the lowest “third” should be in retarded sections. The
organization of our curriculum precludes a course in English composition
in the freshman year, hence it is impossible to give any
data which might support the validity of our sectioning students
above the eighteenth percentile of our English aptitude test. We
have the data upon the lowest group only.

Let us review briefly the known points with which we had to work
in attempting to predict results in this remedial course in technical
English: (a) We had more statistical information than is given in
this report. This additional information is available in the statistical
department of the school of education, but it is omitted here for
practical reasons. (b) We had the psychological placement test
percentile, the high-school record percentile, and the English aptitude
test percentile for each student. (c) We knew from investigation
that the practice factor in the aptitude test was practically
nil.

Our administrative provisions for conducting English A demand
that we divide our subfreshman group into thirds and that we give
remedial instruction to each third for one full term, more or less,
at the discretion of the instructor. Of students assigned to English
A this year, 1 among 158 was excused from the class with less than
one term’s instruction because that student seemed to have been
unjustly held for the remedial course. This student was of foreign
extraction and was compelled to cope with some language difficulty
on account of the difference between a mother tongue and English.

The first half of the term’s work in English A is devoted to a
diagnosis of language difficulties. The instruction is intended to
refresh old recollections of previous teaching and to increase the
store of knowledge of technical English. Diagnostic testing is frequent
and personal conferences are numerous. When a diagnosis is
finally reached, the student is informed and is given work to do which
is intended to remove his weaknesses. There seems to be a direct
relation between the psychological test percentile and the learning
power of the student, because the greatest improvement is usually
observed among those students who rank highest on the psychological
test, and the least improvement is usually observed among those
students who rank lowest on the psychological test. This is a case
in illustration: Miss K had a psychological percentile rating of 0.83,
a high-school percentile rating of 0.44, and an English aptitude percentile
rating of 0.1. j. After the diagnosis and remedial instruction,
she rated equivalent to 0.00 on the English aptitude test. Or here is
another typical case: Miss P had a psychological percentile rating of
0.20, a high-school percentile rating of O.Sl, and an English aptitude
percentile rating of 0.12. After the usual diagnosis and remedial
treatment, she rated equivalent to 0..')4 on the English aptitude test.

Additional data of similar nature are on file with the English bureau 
and may be consulted for further verification of certain parts of the 
conclusions about to be offered.

The following summary presents briefly the conclusions which we
may reach at this point in our investigation of the prediction of success
in college English composition. (a) The aptitude test now
devised is thoroughly satisfactory for the purpose of segregating students
for remedial instruction in technical English. (b) There is a
direct relation between the psychological test score and the improvement
from remedial teaching in technical English. (c) It would
seem advisable to use the English aptitude test for the purpose of
sectioning for regular college instruction those students not held for
remedial instruction. (d) The use of the English aptitude test is in
perfect harmony with a philosophy of education which sets forth the
desirability of achieving maximum results at all levels of intelligence,
whether that implies graduation from college or not.


REMEDIAL MEASURES FOR COLLEGE FRESHMEN


J. DeWitt Davis * and Harold Saxe Tuttle *


I. INTRODUCTION 


The problem of college failure due to nonpassing grades has received
considerable study, and several factors have been isolated (11, 13,
19, 20, 23, 28, 29, 30, 31, 37). To remove these causes there are
three general types of remedial work reported in the literature:
(a) A careful study is made of each case, and work is assigned and
supervised in such a way as seems best fitted to the individual involved.
The program includes testing for diagnosis, conferences,
some class work, and some individual coaching (27, 29). (b) Personal
direction with no class treatment, depending largely upon the interview
technique (17). (c) Group treatment, including more or less of
the values apparent in the other two. This has generally been carried
on through the medium of the so-called How-to-Study courses (5, 9,
15, 16, 25, 26). It is to this last group that the work at Oregon belongs.

II. EXPERIMENT IN REMEDIAL TREATMENT OF FRESHMEN 

The work at Oregon might be termed preventive rather than
remedial, for it is designed to anticipate the more common difficulties
that have been found to exist in the work of beginning students and,
by means of constructive reading, personal interviews, and student
practice, to initiate such habits as may forestall maladjustment and
eventual college failure. A 2-hour course is offered during the freshman
year, under the title Freshman Orientation. This course was
first offered in the school of education in 1927-28, and has been continued
for two reasons: First, because of the values that it seemed to
offer after the first analysis of results obtained; and, second, in order
to accumulate further data that might be useful in directing the
future course of such remedial work. The course is required of all

The following outline will give a better idea of the nature of the
work required by this course:

1. Synopses, readings, and class discussion on the wise use of time
while in college.

2. The actual budgeting of time for various activities, with reports
at frequent intervals of records of a week’s time expenditure.

3. Habits of study. What are the physical requirements, external
and internal conditions essential to good study? How can adequate
study habits be built? What can one do to correct older habits that
are not economical? How can one distribute his study over the day
and week to secure maximum results?

4. Reading improvement. What causes poor, slow reading?
Systematic records of regular drills for the measurement of improvement.

5. Planning how to increase interest in a given subject.

6. Three lectures on library procedure, each accompanied by a
carefully assigned project, each project carefully marked for errors,
returned, and required to be corrected by its author.¹

7. Improvement of vocabulary. Value and methods. Readings,
discussion, drills, and class quizzes.

8. Note taking, from reading and from lectures; readings and
discussion with actual drill.

9. Suggestions for better reviewing; plans for review submitted
and discussed, and later reports on how certain subjects were
reviewed.

10. Study and discussion on how to keep fit physically and
mentally, with help in self-analysis.

11. The importance of proper social adjustment.

12. Preparation for examinations; methods of preparation; types
of examinations.

13. How to build up a good bibliography; actual drill required.

14. Preparation of term paper. Choice of subject, data, treatment,
outline, mechanics of a good paper.

15. The physiology and psychology of learning. How are habits
formed? Poor ones replaced by good ones. Aids to memory, proper
distribution of drills, the value of appreciation, the place of imagination
in adequate learning and living, reasoning in its various forms and
applications, its common enemies. Exercises in self-expression, experimentation,
observation of others, reading, taking notes, and
making class reports.

16. The third term’s work, which is given to the education majors
but not included in that offered the nonmajors, consists in the main
of a preview of the larger divisions of college courses from which
prospective teachers must select teaching norms. Reading drills and
note-taking exercises are continued, their application being made to
the following general topics: College requirements; physical sciences;
biological sciences; social sciences; English; foreign language; physical
education; music; philosophy; expressional activities, as public speaking,
dramatics, story writing, and art. A provisional student program
for the remaining three years of college work is planned by each
member of the class.

¹ Miss CMiprd, assistant librarian at Oregon, supervised this part of the program.

This course must be carefully distinguished from survey courses in
literature, natural sciences, and social science, which are commonly
called orientation courses. In order to keep this distinction clear,
Education 111, Orientation, may, for convenience, be called the 
How-to-Study course. 


III. EXPERIMENTAL STUDIES 

Table 1 indicates the enrollment during the 4-year period of the
Oregon experiment.

Table 1. Enrollment in “how-to-study” course by years and psychological
rating quartiles

Each of these students has been matched, for the purpose of comparison,
against another who is not taking the course. These controls
are selected by the personnel department, care being taken not
only to match them closely on the psychological entrance examination
percentile score, but to pair them closely on the matter of high-school
record as well, which record is also entered as a percentile
score, making direct comparison easy. Because of some evidence
that grading systems in general differentiate between men and women
in favor of the latter (22), the pairing of cases has avoided this sex
variability, matching only in the same sex. To illustrate how successfully
this pairing has been done, the following table has been prepared,
which is a record of 76 cases who completed the first term of the
course during the year 1930-31.

Table 2. Seventy-six paired cases of how-to-study students of 1930-31

Particular emphasis is justified here, for this is the first extensive
attempt to pair cases so exactly on both psychological scores and
high-school grades. (High-school grades are regularly transmuted
into a percentile preparatory rating by the personnel department.)*
This care in matching cases has given direction to the study that
would otherwise have been overlooked.
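
The pairing procedure described above (same sex, closest agreement on the psychological percentile and on the high-school percentile) may be sketched in Python as follows. This is a simple greedy illustration under stated assumptions, not the personnel department's actual procedure, which the report does not detail; the record layout, the distance measure (sum of the two absolute percentile gaps), and the sample records are all hypothetical.

```python
def match_controls(experimentals, candidates):
    """Greedy matching sketch: same sex, smallest combined percentile gap.

    Each student is a dict with keys 'id', 'sex', 'psych', and 'hs'
    (the last two are percentile scores). Each candidate is used at most once.
    """
    available = list(candidates)
    pairs = []
    for student in experimentals:
        same_sex = [c for c in available if c["sex"] == student["sex"]]
        if not same_sex:
            pairs.append((student["id"], None))  # no eligible control remains
            continue
        best = min(
            same_sex,
            key=lambda c: abs(c["psych"] - student["psych"])
            + abs(c["hs"] - student["hs"]),
        )
        available.remove(best)
        pairs.append((student["id"], best["id"]))
    return pairs


# Hypothetical records (not data from the study).
experimentals = [
    {"id": "E1", "sex": "F", "psych": 62, "hs": 70},
    {"id": "E2", "sex": "M", "psych": 40, "hs": 35},
]
candidates = [
    {"id": "C1", "sex": "M", "psych": 41, "hs": 38},
    {"id": "C2", "sex": "F", "psych": 60, "hs": 72},
    {"id": "C3", "sex": "F", "psych": 90, "hs": 20},
]
print(match_controls(experimentals, candidates))
```

A greedy pass of this kind depends on the order in which the experimental students are taken; it serves only to make the matching criteria concrete.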

Considerable data have accumulated in this period of time, and 
they are being carefully analyzed in an effort to determine in how 
far there are statistically significant differences between the experimentals
and the controls, and what the indications are that the how-to-study
work was a causal factor in producing these differences;
and further, to ascertain in so far as it can be done, how values derived
from this course are manifested in later college work. Some of this 
analysis is now available. 

On the basis of the first year’s work, using average grades as a
criterion, the results show as follows:

Table 3. 1927-28 grade averages, how-to-study and controls

Note. The data here reported vary in slight details from a later analysis of the same work due to the
fact that here all the grades were used where later those of military, physical education, and personal hygiene
were omitted.

* If psychological scores are interpreted as indicating native intellectual ability, high-school grades may
be interpreted as reflecting, in considerable degree, habits of application, persistence, and effort.

The change of position from one of inferiority at the end of the
first term, to one of considerable superiority by the end of the third
looks favorable for the experimental group; but when this difference
is treated statistically it loses much of its significance.

The change in relationship between the grades of paired cases is 
more significant. At the end of the first term there were 15 controls 
whose grade average exceeded that of their paired cases; by the end
of the second term this number had dropped to 11, and at the end of
the third to only 7. These results appear to indicate that the drill
of the course, the suggestions for improved use of time, better reading
habits, etc., gradually change the condition from one where the controls
had a slight advantage to one where the experimentals had a
clear margin both in grade average and in the number of cases that
exceeded their controls. This latter fact indicates the rather general
effect of the work of the how-to-study group. These results appeared 
sufficiently positive to justify the continuation of the course. Data 
are therefore available for four years. 

IV. DATA ANALYSIS, 4-YEAR PERIOD

1. Effect as shown by grade averages. Since the former workers
have all utilized (5, 9, 15, 16, 17, 18, 21, 23, 25, 26, 29, 34, 40) the
criterion of grade averages, in one form or another, that analysis was
first made. The task is much more difficult and time consuming
than the relating of it. To secure from the registrar’s office, to compute,
to tabulate, and to compare the work of 178 students, with from
one to three similar control records for each of them, totaling 712
records in all, each of which records covers from 1 to 11 terms, is a 
very large undertaking. This work is not complete to date, but 
enough has been done to give some significant indications. 

It is to be noted that grade averages have all been computed by
omitting both military and those physical education and hygiene
courses that are required of freshmen and sophomores. This was
done arbitrarily, because the authors felt that grades in those courses
would tend to obscure real differences that might develop otherwise.
Since they are required of both groups, no injustice is done by omitting
them in the analysis. Further, no weight has been given to grades
marked incomplete, though a superficial analysis (Table 6) would
indicate that were these “incompletes” included with their later
assigned grade the average of the experimentals would be enhanced.
The controls have somewhat more of such grades, and allowing a
slight discount in grade value for tardiness, it is their average that
would suffer. If an arbitrary value of, say, grade IV or V, or any other,
were used, the same condition would prevail for a like reason, hence
they were omitted.

The data presented in Table 4 below were derived by grouping
into one large distribution all the students' records for each respective
term of college work that was completed to that date, counting as
term 1 that school quarter in which the how-to-study course was
first taken. The first three lines include the three years’ records,
1927-28, 1928-29, and 1929-30, respectively. The fourth line is for
1930-31, and the fifth is a comparison of the last term’s record of all
the students who did the work in 1928-29 and 1929-30. The last
line is a composite of all the grades involved in lines four and five.

Table 4. Differences in grade averages between how-to-study groups and controls

¹ The formula used for this computation was the following:


» <rdifT \ — ^ 

² Computed from Holzinger, Statistical Methods in Education, Table 42, p. 211.
> S line as foyl note I in Tallies <■«, 

The numbers involved in each comparison are fairly large. In each
grouping the experimentals’ mean grade averages X (column 3) are
consistently better than those of the controls C (column 4). This
consistency of data tends to strengthen the probability that the cause
of these differences in favor of the experimental group is more than
mere chance, even though column 7 would allow some leeway for a
chance factor in at least the first two terms. In the last line the latest
available term’s grades only were compared in each of the 151 paired
cases. Of these some were first-term 1931 grades, some second, and
so on as far as the eighth term. The difference in grade averages of
columns X and C is small, only 0.134 grades, but the difference is
clearly significant as indicated by column 7.
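
The kind of significance test referred to above can be illustrated with a short Python sketch. The formula printed in the footnote to Table 4 is not legible in this copy, so the sketch below does not claim to reproduce it; it assumes the ordinary textbook standard error of the difference between two independent means, and all names and figures in it are hypothetical. For closely matched pairs a formula that allows for the correlation between the groups would give a smaller standard error.

```python
import math


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two means divided by its standard error.

    Assumes independent groups:
    SE_diff = sqrt(sd_a**2 / n_a + sd_b**2 / n_b).
    Returns (difference, standard error, ratio).
    """
    se_a = sd_a / math.sqrt(n_a)  # standard error of the first mean
    se_b = sd_b / math.sqrt(n_b)  # standard error of the second mean
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    diff = mean_a - mean_b
    return diff, se_diff, diff / se_diff


# Hypothetical group summaries (not figures from Table 4).
diff, se_diff, ratio = critical_ratio(3.5, 0.6, 36, 3.2, 0.8, 64)
print(round(diff, 3), round(se_diff, 3), round(ratio, 2))
```

The larger the ratio, the smaller the chance that a difference of that size would arise from sampling alone, which is the reasoning behind the remark that a small difference may still be clearly significant when the numbers are large.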

Another approach was made to determine whether the differences
indicated were consistent from term to term for each year’s students.
Table 5 sets forth these average grade differences.

Table 5. Comparison of average grades, first three terms for 1927-28,
1928-29, 1929-30

[X = Experimentals; C = Controls]

Year          Number      X        C      Difference

FIRST TERM
1927-28         27      3.575    3.400     -0.175
1928-29          ?      3.492    3.708     +0.216
1929-30         39      3.311    3.400     +0.089

SECOND TERM
1927-28         29      3.385    3.335     -0.050
1928-29         28      3.357    3.544     +0.187
1929-30          ?      3.271    3.403     +0.132
1930-31 ¹       83      3.296    3.541     +0.245

THIRD TERM
1927-28         21      3.387    3.411     +0.024
1928-29         24      3.262    3.604     +0.342
1929-30         31      3.310    3.480     +0.170

¹ The first available grades of 1930-31 were compared; some were first term, some second term averages.


This table is consistent with that set forth above (Table 4). The
negative difference in the first term's work of 1927 was gradually
changed by the third term of that year into a positive advantage.
The beginning negative difference may be due to the fact that the
matching of pairs was not quite so thoroughly refined as for later
groups.

To account for this consistent difference in favor of the experimental
group the suggestion has been made that the content of the course
and the kind of treatment are particularly valuable for prospective
teachers, who would therefore naturally profit more by it than other
students. Another suggestion has been offered, that a different
type of student enrolls as an education major, and also that the grading
system of that department is different. Now, if such were the
adequate explanation, then theoretically some other group of
noneducation majors should not show these differences. This, however,
is not the fact, as indicated by the items bearing footnote 1 in line 4 under
second term of Table 5. Of this entire group, 83 in number, only 13
were education majors; the rest belonged to various other departments
(except law). Yet the cases in this group show the largest
difference of any group for the first or second term. Moreover, the
difference, as indicated by Table 4, is clearly significant statistically.
This would seem to preclude all of the above objections and to point
to the how-to-study remedial treatment as the causal factor. It is,
at least, in this particular respect that the groups are known to have
been treated differently.

1. Effect as indicated by differences in unsatisfactory grades. — Since
Jones (17), Pressey (24), and Lemon (21) have all emphasized the
fact that their experimentals showed fewer cases of unsatisfactory
scholarship as indicated by failures, conditions, or probations, it was
deemed worth while to analyze the data at hand for corroboration
or negation of this emphasis. Table 6 sets forth facts discovered.
The entire college records of 149 cases and their controls were available
and were examined. Some of these records covered one term, others
as much as 11 terms' work. In this analysis grades marked incomplete
were included as unsatisfactory, on the theory that such a grade
indicates some sort of maladjustment, or inability that the student
was unable to remove on schedule time.


Table 6.— Unsatisfactory grades of how-to-study students and controls

[Body of table not recoverable from the source. The legible entries of
the total line give 92 unsatisfactory grades for the X group and 101 for
the C group, or 4.40 and 4.80 per cent, respectively.]

1 These grades are not weighted for course hours. They are simply the number of such grades received.
NOTE.— Of a total of 4,164 courses only 193, or 4.63 per cent, were unsatisfactory.


In the whole group and in each of its quartiles the numbers of courses
carried by the X and C groups are reasonably uniform, although the
differences in the three upper quartiles may have had some bearing
on the number of unsatisfactory grades (columns 3 and 4). The
control group (C) had more unsatisfactory grades than the experimentals
(X group) in the first and fourth quartiles, and in the total.
Though this difference is not as large as Book (5) reported, it does
indicate a similar condition.

These data also emphasize the importance of the statement that
there are other factors causing dropping out of college, for of a total
of 4,164 courses taken only 4.63 per cent were not completed satisfactorily.
Lemon (21) presented data that showed 57 per cent of the
lowest decile as dropped out by the end of the first year. Here it is
indicated (columns 7 and 8) that only 7 to 10 per cent of the 257 grades
turned in for the entire low-quartile group are of unsatisfactory
quality, and some of these records reach as far as the fourth year's
work.

Closer study of this table (columns 5 and 6) suggests that students
respond to the type of work offered in the how-to-study treatment in
a different way, depending somewhat upon their ability and habits
as indicated by their psychological and preparatory scores.

Both the lowest and the highest quartile show marked favorable
difference, the low-group controls receiving 34 per cent more unsatisfactory
grades, and the high-group controls getting 85 per cent more
such unsatisfactory grades than the corresponding groups of experimentals.
Yet the data are not consistent, for the middle groups
favor the controls in this respect.

Another approach gave a more promising lead, and one that throws
new light on Pressey's recent statement to the effect that students
above the twenty-fifth centile in ability profit most from such remedial
treatment. In the preliminary analysis of 76 subjects carefully
paired with controls there was some evidence that differences between
percentile ranking in preparatory grades and psychological scores were
more significant with respect to improvement under guidance than
were either of the scores taken separately or the two combined.

This would appear to mean that students whose achievements in
high school were distinctly lower than their psychological tests would
lead one to expect were helped quite decidedly. Students whose
achievements in high school were higher than their psychological
score would lead one to expect were aided but little by the how-to-study
course.

This clue led to a fuller study of the relation between the differences
of percentile ranking and scholastic attainment.

A study of the effect of percentile differences brought to light the
following facts: When the average grades of all students in the lowest
quartile were compared with those of their controls, disregarding the
matter of spread between the two percentile scores, it was found that
there was only a slight difference, namely, 0.011 grade value, in favor
of the experimental group. This difference has no significance
statistically and appears to corroborate Pressey's conclusion (26)
with respect to low-quartile students, suggesting that work with them
is too expensive to be justified in the light of results secured.

However, 55 other case records with their controls were studied
where the psychological score was 20 or more centiles below the
preparatory score (which, as already stated, may be thought of as
a habits-of-study score). The mean difference was 0.077 grades in favor of
the experimental group. This difference also is not significant
statistically, though it is consistently favorable to the experimental
group. However, if only ability, as indicated by the psychological
scores, is important, then this difference should be sufficiently greater
than that noted above to be significant, for the group includes a large
number whose rank is well above the lowest quartile. It appears
that those students whose habits of study, as evidenced by their
preparatory score, rank above their own ability level, as indicated
by the psychological score, are least affected by the how-to-study
remedial work.

This surmise is further strengthened by the data presented in Table
7, wherein the records of 151 cases and their controls were carefully
analyzed. The last available term's averages were used. These
cases were divided into two large groups included in the data presented
in lines 4, 5, and 6 of Table 4. In each case the differences for the
separate groups were highly significant and when massed into one
composite group of 151 cases have a surplus of significance.

Each of these 151 cases was put into one of the following categories:

1. Those whose psychological score is 6 centiles or more greater
than their preparatory score.

2. Those whose psychological score is within 5 centiles of their
preparatory score.

3. Those whose psychological score is 6 centiles or more less than
their preparatory score.
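
The three categories just listed may be put in the form of a short sketch. The function name is hypothetical; the cut-off of 6 centiles and the two percentile scores are as described above, and the three cases used for illustration are those cited by their ratings in the section on reading improvement.

```python
def category(psychological, preparatory):
    """Place a case in category 1, 2, or 3 from its two centile ranks."""
    difference = psychological - preparatory
    if difference >= 6:
        return 1  # ability index well above the habits-of-study index
    if difference <= -6:
        return 3  # ability index well below the habits-of-study index
    return 2      # the two scores within 5 centiles of each other


# Ratings are given as psychological score, then preparatory score.
for name, psych, prep in [("Go", 66, 62), ("Ek", 91, 91), ("Ok", 26, 2)]:
    print(name, category(psych, prep))
```

Under the assumption stated in the text, cases falling in category 1 are those for whom the remedial work would be expected to do the most.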

The basic assumption involved here is that the psychological score
is an ability index, and that the preparatory score is an index to
habits of satisfactory adjustment, which habits it is the purpose of
remedial work to build up. Theoretically, then, those students who
are already working above their ability level should be helped the least.

As one examines the data set forth in Table 7 it becomes apparent
that, in so far as the experiment has progressed, the early surmise is
justified that those students whose achievement record, as evidenced
by preparatory scores, is lower than their ability are aided most by
the remedial work; and that those students who are already working
over their ability by the same index when they enter the course
are helped least, if indeed at all.

Table 7.— Comparison of how-to-study and controls in relation to psychological
and preparatory scores

This modifies the conclusion of Pressey (25) that the ineffectiveness
of remedial measures is due specifically to low intelligence, implying
rather that it may be due to the fact that all with low intelligence
who gain admission to college have already developed habits of study
superior to the average of their abilities.

V. READING IMPROVEMENT 

From the beginning of each year’s work, considerable emphasis 
was placed upon the importance of efficient reading. What are the 
mechanics involved in such reading? What are the chief enemies
to good reading habits, and how may they be overcome? The problem
here implied was made a matter of major study and drill. 

Careful records for three groups have been made from week to week
throughout the period of drill. Each student, three times a week,
reads some uniform material for at least 20 minutes, making an effort
to apply the ideas gathered about efficient rapid reading, not skimming.
From each of these efforts he makes a words-per-minute
reading check. Practically complete records for one term were
secured from 70 students. The master group sheet shows only the
weekly average of these three or more records.

To stimulate interest in reading improvement a very specific procedure
was followed. The class was given reports showing actual improvements
made by similar groups. This was done every week at the
second session, the records having been received at the first session. The
exact average of the group, the median, the high, and the low were
also given on the blackboard in table form and each student was
urged to keep his own record, indicating where in this total group he
found himself from week to week. Moreover, from time to time short
formal and informal reading tests and speed checks were given in
class period. This served the double purpose of added group drill
and of added records to compare with student reports.

On the whole the data show positive gains not identical with, but
comparable to, and in general corroborative of, other remedial work
in improving reading, such as that of Book (4), Remmers (30),
Pressey (27), and others. Table 8 sets forth some of these data, which
were gathered from three separate groups, for convenience called
A, B, and C. A was a class of 33 noneducation major freshmen doing
the work here reported in the 1930 fall term. B was a group of winter
term noneducation freshmen, 14 out of 20 belonging in the low half
of ability, including 5 belonging in the low decile. C was a group of
chemistry students. Because of schedule difficulty this class (C) was
divided, allowing more than the usual amount of personal attention to
individual difficulties. When the data of these three classes are
thrown into one distribution, they tend to show a more reliable picture.

It is clear, however, from Table 8 that Group B was benefited least
by the reading drill. The fact that Group B was not given individual
interviews involving reading diagnosis and suggested changes for
improvement, as were the other groups, may account for some of the
failure to respond as well as did both other classes.


Table 8.— Reading rate improvement of three how-to-study groups, 10 weeks

[Body of table not recoverable from the source. Its footnotes indicate
that gains are stated in words per minute and in per cent.]


Table 9 shows the gains in reading speed made by each quartile
when the data were massed. Emphasis was placed on improved rate,
it being assumed from former studies (1, 30, 33) that comprehension
follows closely with the increased rate.


Table 9. — Reading-rate improvement — 10 weeks' record of 70 cases

[? = entry illegible or doubtful in source]

                          Average words per minute
Percentile     Number     First week   Last week    Gain    Per cent gain
    1             2            3            4         5           6

0-24             27          232          306        78         31.6
25-49            21          250          324        73         29.3
50-74            13            ?          273?       67            ?
75-100            9          296          344        48         16.4

Total            70          240          312        72         30.0



Some facts stand out clearly in these figures. First, as a group it
read about 6 words per minute too slowly for college freshmen (38)
when first measured. Second, as a group under the remedial treatment
given, it responded with a 30 per cent increase in speed, arriving
at a 66 words per minute advantage over the norm of 246 words as
established at Nebraska by Werner in 1926. Third, lack of personal
interview or some other cause or combination of causes resulted in
less improvement on the part of Group B. Fourth, all those students
who appeared to be in earnest and interested in doing so succeeded
in increasing their reading speed, while others with plenty of ability,
such as case Go, rating 66-62, began at 203 words per minute and
ended at practically the same point, 214. Another case, Ek, rating
91-91 in psychological and preparatory scores, respectively, began
at the very inefficient level of 158 and ended at 280 words per minute;
and Ok, rating 26-02, began at 173 and ended at 340, showing a gain
of 167 words per minute, or 98.2 per cent.
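
The arithmetic behind the group figures just cited may be set out in a few lines. This is only an illustrative sketch: the function name is hypothetical, while the averages of 240 and 312 words per minute (the total line of Table 9) and the norm of 246 words are taken from the text.

```python
NORM = 246  # words per minute, the norm for college freshmen cited above


def reading_gain(first_week, last_week):
    """Return the gain in words per minute and the per cent gain."""
    gain = last_week - first_week
    return gain, 100 * gain / first_week


first_week, last_week = 240, 312  # group averages for the 70 cases
gain, per_cent = reading_gain(first_week, last_week)
print(gain, per_cent)
print(NORM - first_week)  # words per minute below the norm at the start
print(last_week - NORM)   # words per minute above the norm at the end
```

The per cent gain is reckoned on the first week's rate, which is why a gain of 72 words on an initial 240 comes to 30 per cent.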

Figure 1 has been prepared to indicate the spreading effect of 
remedial work in reading. It is of interest to note that the upper 
limit graph is the score made by the same individual while the honors 
at the bottom of the list, as would be expected, were shared by several 
different students from week to week. 

While there was considerable difference between the high and the
low reading rates the first week (119 words per minute), yet the
deviation from the mean was less than that of later records. The
slow readers appear to have been working hard to keep up, and the more
capable ones appear to have been loafing. The slowest reader was
63 below the mean and the best one only 56 above.

When reading drill began this spread at once started to grow.
The average of the whole class increased steadily, but the greatest
speed records were made by those who had the higher psychological
ratings. (See Table 9.) At the end of the 10 weeks' period the spread
between the high and low per minute records had increased from 119
to 281 words, the final spread being 236 per cent of the original, apparently
due to the training. Of this total spread the greatest deviations were
regularly above the mean of the group; so that in the last record the
distance from the average rate of the total group to the lowest score
is now 78, as compared to 63 words per minute for the first week's
record; and the distance above the average to the highest score is
increased to 203 words per minute, where it began as only 56.
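
The spread figures may be checked by a brief computation. The sketch below is illustrative only and the function name is hypothetical; it uses the distances below and above the group average given in the text, namely 63 and 56 words for the first week and 78 and 203 words for the last record.

```python
def spread(below_mean, above_mean):
    """Distance in words per minute from the lowest to the highest rate."""
    return below_mean + above_mean


first_spread = spread(63, 56)   # first week's record
last_spread = spread(78, 203)   # record at the end of the 10 weeks

ratio_per_cent = 100 * last_spread / first_spread
increase_per_cent = ratio_per_cent - 100

print(first_spread, last_spread)
print(round(ratio_per_cent), round(increase_per_cent))
```

On this reckoning the last spread is about 236 per cent of the first, which is the same thing as an increase of about 136 per cent, nearly all of it above the group average.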

Massing data as in Table 9, or graphing them as in Figure 1, reveals
larger tendencies which are indicative of value to be derived from
reading training, but it tends to obscure more detailed facts such as
those set forth in Table 8.

Column 5 of Table 9 indicates that ability quartile one ranks
second in the total group for words per minute gain in reading speed.
The numbers involved are too few for any extended analysis, but it
is interesting to observe in Table 8 that quartile one in Group B showed
an average gain of only 14 words, in Group C the best gain of any,
143 words, and in Group A a gain of 58 words, only lower than the
average gain for the whole group. In the light of this extreme variation,
when one recalls the difference in treatment given each group as
before noted, is there not an implication that improvement in reading
rate depends not entirely upon ability as indicated by the psychological
scores but rather both upon the amount of efficiency when
[...]
both more adequate diagnosis and teaching, and allow little ground
for blaming lack of improvement in reading entirely to lack of ability.
In short, the top and bottom quartiles appear to need a different
type of remedial treatment which, when given to each in the best
way, may perhaps tend to cause each to approach its maximum
capacity in reading rate.

To test this hypothesis the whole problem of reading improvement
should be restudied, first on the basis of differences between ability 
(psychological) scores and habit (preparatory) scores, and second 
on the basis of differentiated treatment for the lowest and the highest 
ability groups. 

It would seem that, to utilize the ability made available by such
remedial work, the length and difficulty of assignments involving
considerable reading should be regulated so as to provide not only
for the poor readers, where the emphasis appears frequently to be
placed, but also to demand from the better students more of the
ability made available by such training as is here provided.

The question is asked, will this increase become a permanent 
possession? This question can be answered theoretically by putting 
another question. Reading is a skill habit; will any skill habit, e. g.,
typewriting, shorthand, or piano playing, be retained unimpaired
if it is not used regularly? Will it be used regularly if assigned tasks
do not demand it? Does one exert himself and maintain high efficiency
when necessity does not require it? Does college study put
the most rapid readers on their mettle?

VI. VOCABULARY DRILL 

Words are tools to aid in shaping social adjustments. Words 
become surrogates for large blocks of past experience when one 
learns to use them efficiently. In this sense words are keys to the 
treasures of the past, and talismans to the secrets of the future. 
Upon such a theory of word value vocabulary drill has been recently
included in the how-to-study remedial program.

Lack of word understanding manifested itself in different ways. 
In the personal interviews related to reading diagnosis several 
subjects, when asked why their eye movement regressed, replied
that they did not get the meaning involved because of some new
word or word usage.

It has been well established that regressive eye movements are
one of the several causes of inefficient reading. One may conclude,
then, that as a part of the reading improvement such vocabulary drill
should be encouraged, for as words become familiar the proper meaning
is derived more quickly and less flitting back of the eye is necessary.

Something of this lack of word understanding may be indicated
by quoting a few student responses given on various quizzes:

Choleric: Being ignorant about.
Chimerical: Quick tempered.
Cozening: To reason earnestly.
Gregariousness: Quarrelsome.
Ambiguous: Gigantic.
Fallacious: Condescending.
Ephemeral: Effeminate.
Decorum: Belief.
Expostulate: To eradicate.
Hedonic: Unfortunate.
Grovel: A mere trifle.

In fact, a collation of some of the student choices in best meaning
put together would be fair material for "College Humor" or
"Facetious Fragments."

Lists of 70 words were assigned each week. Students were advised
that a random selection of 20 words from these lists would be given
in a quiz every 2 weeks. The type of quiz was explained, and a
sample given, all of which placed the emphasis not upon single
synonyms, but upon larger meaning content and usage, for the
student never knew just what part of the meaning pattern would be
used in the quiz.

An effort is being made to determine the relation of the scores
made on the vocabulary tests to the average term grades of each
student and to the grades in specific subjects as compared with the
control averages and grades. This analysis is not complete, however,
at this time. As indicated by the appraisal of value by students it
would seem to be a useful form of remedial treatment.

VII. STUDENTS' APPRAISAL OF VALUES

In Ohmann's study (23) the suggestion is offered that the effectiveness
of the treatment given throughout his course could not be
accurately measured until some time later, but he added, "A subjective
conviction of its value came perhaps most forcefully from
the expressed appreciation of individual students who had been
helped." In this Oregon study definite effort was put forth to secure
and tabulate such student expression and to evaluate it.

Procedure: At the close of the winter term the following blank was
put in the hands of each student who had been in this course since
it was first offered in 1927.



COPY OF LETTER FORM

(School of Education letterhead)

March 3, 1931.

No.

One time member of Orientation Class, Education 111.

Dear Friend:

We are making a survey of the opinion of various students as to what each
one thinks are the most lasting values derived from certain courses in college
work. You have done some study in a Freshman Education Course No. 111,
called Orientation. In view of its bearing, as you look over your whole college
career, can you name five elements, things, or phases of that course from which
you derived some value which was of more or less help either in your studies or
in your general adjustment to daily life? Now as you look back over that course,
select the one of the five which you rate at present of greatest worth, and assign
it the value 5; similarly score the others in descending order, assigning the value
1 to that which you estimate as lowest.

Your reply will be treated as confidential; your name need not be signed
unless you prefer to sign it. Prompt return of this sheet will be appreciated,
and will expedite our study.

Write your five items here:

                                                        Assigned value

Further remarks:

Sincerely,


The form was carefully prepared to avoid any suggestion, and yet
to require emphasis only on the values which were outstanding to
the student at the time. Only five such values were asked for, in the
thought that such a limitation might make the emphasis more indicative
of actual conditions.

Returns were received from 97 students, some having had the work 
of the course as early as 1927-28. Data were tabulated as indicated 
in Table 10. 

Table 10. — Student evaluations of the course

[? = entry illegible or doubtful in source]

                                                      Number of replies placing item
                                             Rela-    5,     4,     3,     2,     1,     Per     Weighted
Values indicated by students                 tive    first  second third  fourth fifth   cent    values 1
                                             rank    place  place  place  place  place   of re-
                                                                                         plies
                  1                            2       3      4      5      6      7       8        9

Training and drill in scheduling time
  for study, analysis of time expenditure      1      34     23     11      ?      ?      74      263
Improving reading ability                      2      17     22     20      9      8      78      259
How to use the library more efficiently        3      10     12      4     10     10      47      140
Interest in increasing one's vocabulary        4       9      ?     10     15     13?      ?      137
Some improvement; learned how to
  improve my methods of study                  5      13      6      4      1      8      35      108
General content of reading material
  assigned                                     6      10      1      8      4      8      34        ?
How to make better notes and how to
  use them                                     7       ?      1      8      3      8      30        ?
How to review and prepare for quizzes
  and examinations                             8       2      3      8      4      1      19?       ?
Something learned about the physiology
  and psychology of college work               9       3      1      5      4      6      20       48
Unclassified miscellaneous values             10       2      4      4      3      6      19?      46
Drill in mathematical problems 3              11       3      2      3      2      1      11       37
Survey of different fields of possible
  college courses to follow                   12       0      1      4      3      3      11       25
Of general aid to all college work            13       0      2      3      2      2       9       23
Drill work on equations and formulas 3        14       1      2      1      2      1       7       21
Guided toward vocational choice, or
  aided in knowing how to choose              15       1      0      3      2      0       6       18
Preparation of term papers, technique,
  etc.                                        16       0      2      2      0      2       6       18
Personality improvement                       17       1      1      1      0      1       4       13
Use of memory aids, etc.                      18       0      1      2      1      0       4       11
Created interest; learned how to
  increase interest                           19       0      0      3      0      0       3        9
Taking part in discussion                     20       1      0      0      1      1       3        8
Cultivated habit of promptness                21       0      1      1      0      0       2        7
Improved my mental health                     22       0      0      1      0      1       2        4
Learned how to compile a bibliography;
  the mechanics of it                         23       0      0      0      1      1       2        ?

1 The weighted values in column 9 were computed as the sum of the products of the number of replies by the weight for each rank indicated.
2 Most of the returns were signed.
3 These replies all came from the chemistry how-to-study section.


Value 1.— The item receiving first place, mentioned by 74 per cent of
the returns, relates to economies in the use of time. The value, as
they indicated it, lies in actually scheduling one's own time, rather
than merely in reading about how it is done, or in discussing the
subject at class period; e. g., subject 19 gave "Planning for the week's
work" second place and put "improvement" in parentheses, indicating
that he had noted his own progress. No. 9 of the 1929-30 class gave
"arranging a study schedule and having definite study habits" first
place. Another student (No. 36 of the 1929-30 group) said: "Taught
me the value of a definite schedule, not only in school but also out."
If this item is of chief value, as seems here indicated, this fact may
in part account for the similar results that are achieved under apparently
different treatment. Essentially every attempt at remedial
work has emphasized this phase of its program. L. Jones (17) gave
help on the basis of "constructive individual guidance — without waiting
for difficulties to arise to initiate such assistance." A major part
of this guidance is related to economies in the use of one's time, as
is indicated in the following paragraph:

The time charts, used to reveal to the student and to the writer the amount
of time actually given to studying, enabled the writer to advise discriminatingly,
and the student to schedule adequately, the amount of time needed for his studies
in order to improve his record. Of the four variables — native ability, time, study
methods, and grades achieved in a student's career, he can exercise more control
over the use of his time than over any other one. * * * (17).

As measured by the five criteria selected, Jones found his experimental
group significantly superior in each respect to the controls.
His emphasis is the same as that at Oregon with respect to this value;
the treatment is different, his being the more expensive case method,
the method here that of group direction. Though the methods differ,
there is a common point of emphasis on self-analysis in the expenditure
of time which seems to contribute toward similar results in improvement,
and which impresses the students with its value as herein
indicated.

Value 2.— In the light of the experiments that have been reported
relating to reading improvement, the ranking of this item in second
place is easily understood. More students included it as one of the
five chief values than any other item; but more of them placed it
lower down in the scale, second or third, so that when properly
weighted it became a close second, 259 as compared to 263 for value 1.
The improvement in reading, as set forth above, offers some explanation
of this student evaluation.

Value 3 . — As estimated by these student reports, the instruction 
and drill in how to use the library more efficiently receives third 
place in value. This is not to be wondered at when one notes the 
tendency in college courses to depend more and more upon current 
magazine material, and a diversity of authority rather than upon 
some specific textbook (10, 14). 
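
The weighting by which the items were ranked may be illustrated with the library item. The sketch below assumes, as the footnote to Table 10 indicates, that each reply counts 5 for a first place down to 1 for a fifth place; the function name is hypothetical, and the counts of 10, 12, 4, 10, and 10 replies, with the 97 returns, are read from Table 10 and the text.

```python
WEIGHTS = (5, 4, 3, 2, 1)  # first place counts 5, fifth place counts 1


def weighted_value(counts):
    """Sum of the products of the number of replies and the rank weight."""
    return sum(weight * number for weight, number in zip(WEIGHTS, counts))


library = (10, 12, 4, 10, 10)  # replies giving places one to five
returns = 97                   # total returns received

print(weighted_value(library))
print(round(100 * sum(library) / returns))
```

An item named by many students but placed low may thus fall below one named by fewer students who placed it high, which is the relation between values 1 and 2 noted above.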

Value 4.— The high rank of vocabulary training as a value derived
from the course may require elucidation. In the first place the
1930-31 chemistry group * reported with strong emphasis here, 11
out of 16 valuing this part of their work highly. In the second place,
during this year greater emphasis in all of the how-to-study classes
has been put upon actually building up one's vocabulary, while
formerly only what might be done, as suggested by various authors,
was pointed out. At three times during the work the students were
called upon to hand in lists of words which they had found in their
reading and had added to their vocabulary. From these lists quizzes
were prepared which stimulated the student to further effort. This
greater emphasis may account for the fact that vocabulary building
was assigned fourth place in the total group of responses, whereas it
was not accorded any importance by the replies coming from classes
prior to the last year when drill on vocabulary was introduced.

Value 5.— By some other more or less arbitrary grouping of replies,
putting every one that related in any way to improvement in study
habits, learning how to study, better study as related to notes, lectures,
reviews, general health, etc., under this heading, one might
have found this item in the most important position. By classifying
many of such more general replies under other appropriate headings
the emphasis still holds this item to be one of the first five values
derived from the course.

* In the 1931 winter term an effort was made to give a group of chemistry students the same sort of remedial
treatment, but focusing the drill specifically on chemistry content, e. g., library work was assigned on
topics of interest to chemistry students, reading was encouraged on chemistry material, chemistry problems
were assigned for fundamental arithmetic drill, and vocabulary building was all in the field of chemistry.

Miscellaneous; unclassified values. — It has been suggested ¹ that 
there may be a delayed benefit possible in such work. Some of the 
replies classified here as miscellaneous add weight to this suggestion. 
No. 45, now a substitute teacher, places the following emphasis: 
Value 5. The course “enables a grade teacher to begin (pupil) 
study habits correctly.” Value 4. It “gives the teacher better 
understanding of a pupil’s errors.” No. 73, now a teacher in an elementary 
school, reports, “I find that the study I did * * * has been a 
great aid to me. I am in favor of the * * * courses, as I think 
they help one a lot.” 

Another student, No. 55, attributed greatest value to personal 
conferences and advice received in them. 

The remarks written in on the returns from the students of former 
classes were, save for two exceptions, strongly in favor of offering 
such a course to all freshmen. Some of them are given below. 

Table 11. — Typical remarks from how-to-study students 

Column headings: Case No.; Ranking (Psychological, Preparatory); Remarks 

Figures as they survive in the source, in reading order (their alignment with the remarks is not recoverable): 
7, 08, 85, 20, 72, 01, 2, 07, 00, 37, 23, 30, 0, 17, 00 

Remarks 

“I think the course was very beneficial to me and I think it would be good 
for most freshmen.” 

“I have been able to plan ahead. * * * Do not leave it out.” 

“It is a course which I believe would be beneficial to every freshman.” 

“Every college freshman should take the course.” 

“Orientation is a help to any student as we are taught how to save our time 
and study most effectively.” 


The psychological and preparatory percentile ranks are given to 
suggest that these comments come from various levels of student 
ability and training although in general the tendency as here evi- 
denced, is for the upper half to recommend such work oftener than 
the lower half do. This may possibly mean that it is they who derive 
the most lasting benefit from the course as it is now being taught. 


VIII. SUMMARY AND SUGGESTIONS 

1. Summary. — A review of the reported work designed primarily 
to enhance the chances of college success for freshmen shows 
consistently positive results, as measured by various criteria. 

The technique of remedial work varies. One type is essentially 
that of personal tutoring. To this method some would raise the 
question as to whether the results achieved are worth the expenditure 
of time and money involved. Another type is that of personal 
guidance illustrated by the work of L. Jones at Iowa. A third type 
which has been employed most extensively, and to which the work at 
Oregon belongs, is that of group treatment. This type generally 
includes in its program personal interviews and tests both for 
diagnostic and motivation value, as do also the other two types of 
program. Reading, discussion, some lectures — generally combined with 
drill in note taking — and a large amount of specific drill are also 
typical of this last group. 

¹ Jones, director of personnel research at the University of Buffalo, Buffalo, N. Y. In a 
personal correspondence, dated Jan. [day illegible], 1930, to Dean H. D. Sheldon at Oregon. 

Thus in content the treatment given by the three is largely the 
same. Jones at Buffalo differs in his group work by crowding the 
remedial treatment into a preregistration period. This procedure 
would seem to have some advantages and some drawbacks. It would 
allow all attention to be centered upon one thing, namely, remedial 
work. It would also be easy to segregate the expense and charge it 
to the students who do the work. Being of a nature which perhaps 
should have been mastered before college matriculation this may be 
a justifiable procedure. It would also, as Jones points out, tend to 
allow a few of the very least capable to drop out before registration. 

No reports of results achieved from purely lecture courses on how 
to study were found in the literature. The authors of this report 
are utterly skeptical about such courses, if there be any; for habits 
are built or corrected not by exhortation or by telling why and how, 
but by actually doing, by personally experiencing good methods of 
study, by drill in better ways of doing college work, and by measuring 
one’s own progress in this development. 

In this report of the how-to-study work, members of experimental 
groups are shown to have a consistent superiority over the control 
groups, when this superiority is measured by average term grades. 
It seems from the data analyzed that this advantage is due at least 
in part to the treatment received in the how-to-study program. 

Both the analysis of average grades from term to term and of the 
students’ own statements as summarized in Table 10 indicate that 
some values are retained for later use. Table 4 indicates that 
average grade superiority manifested itself more significantly the third 
term than the first or second. But the most significant advantage 
was apparent when the average grades for the last available term’s 
record, some as late as the first quarter of junior year, were compared. 

The how-to-study group also made slightly fewer grades of 
condition, incomplete, or failure than their controls and continued to do 
so from term to term, though these grades are too few in number to 
account for any large part of college mortality. 

Further analysis of the data as indicated in Table 7, column 8, 
shows that it is not merely the low quartile ability group that profit 
little or nothing by the how-to-study treatment; it is rather those 
who appear to be working above their ability when they enter the 
course. This finding may prove to be of value to personal advisers 
for interviews and guidance. 

The results of reading drill are in general corroborative of those 
obtained in other studies; but the facts set forth indicate that to 
obtain a reading improvement at the different levels of ability the 
teaching and general drill method must vary, the low group requiring 
specific, detailed patient drill, where mere suggestion may suffice for 
the best students. The spreading effect of this reading drill is 
conspicuous. In the work two types of motivation appear of especial 
value; first, personal difficulty diagnosis, and, second, regular 
notation of the improvement of the group as a whole, indicating the best 
and the poorest score from week to week. 

No analysis as to the effect of vocabulary drill is complete enough 
to indicate measurable values. Student emphasis ranks this work 
as being worth while. 

Two points stand out clearly with respect to drill in library work: 
First, that two departments can cooperate effectively in such a 
program of remedial treatment; second, in personal interviews and in 
the replies tabulated, library drill was given a high rank as having 
value for student success. 

The student replies may also be of service in suggesting where 
emphasis may be placed in remedial work with promise of greatest 
returns. 

One major conclusion is evident from the data presented, namely, 
whether determined by objective statistical treatment or by 
subjective student evaluation, the parts of the remedial how-to-study 
program that seem to contribute most toward student success consist 
largely of those things the student actually does, drills at, 
experiences, and notes progress in. At this point all studies agree 
thoroughly. 

In general it appears that the how-to-study work should be 
continued at the college level and made available to all freshmen, but 
that those freshmen particularly whose psychological scores are within 
a few centiles of or greater than their preparatory scores should be 
encouraged to elect the work of this course. 
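
The recommendation above amounts to a simple comparison of two percentile ranks. A minimal sketch in Python follows; it is an illustration only, not a procedure given in the report. The function name `should_encourage` and the default `margin` of 5 centiles are assumptions, since the report says only "a few centiles."

```python
def should_encourage(psychological_pct, preparatory_pct, margin=5):
    """Flag a freshman for encouragement to elect the how-to-study course.

    psychological_pct: percentile rank on the psychological test.
    preparatory_pct: preparatory percentile rank.
    margin: assumed reading of "a few centiles"; not a value from the report.
    """
    # True when the psychological rank exceeds the preparatory rank,
    # or falls below it by no more than `margin` centiles.
    return psychological_pct >= preparatory_pct - margin


# Hypothetical students: (psychological, preparatory) percentile ranks.
for ranks in [(60, 40), (48, 50), (30, 70)]:
    print(ranks, should_encourage(*ranks))
```

A student whose preparatory rank stands well above the psychological rank is not flagged, which matches the finding that those working above their measured ability profit least from the treatment.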

2. Suggestions. — Out of this study have arisen some problems which 
seem to merit some further experimentation. 

The remedial work for chemistry majors along similar lines to those 
followed in the 1931 winter term might well be continued far enough 
to establish its value or lack of it. This could be done by advising 
its election in the fall term on the part of freshmen whose 
psychological scores are distinctly higher than their preparatory scores. 
Considerable evidence indicates the value of the personal interview 
for both diagnostic and remedial work. To secure maximal values 
personal interviews require freedom from interruption and 
considerable privacy. Provision for both of these factors would no doubt 
increase the value of the work. 

A study of the relation that may exist between hours spent in study 
per week, reading rate, vocabulary test scores, and psychological and 
preparatory scores is now being made by the authors. It may shed 
some further light on the problem of diagnosis and remedial treatment. 

Vocabulary training with a large group in some specific field like 
chemistry or biology following the technique used in this study, 
with cases as carefully matched, is another promising lead for further 
research. 

The implication of this type of remedial work is an old one — that 
college education should be not faculty centered or curriculum 
centered but student centered, with a program in which all are encouraged 
to function at their best. 
GROUP III — ADMINISTRATIVE MEASURES BASED 

UPON TEST RESULTS 

AN APTITUDE TEST AS AN AID IN ADMINISTERING 
LARGE SECTIONED COURSES 

A. B. Stillman* 


INTRODUCTION 


For several years those in charge of the course in constructive 
accounting at the University of Oregon have been making a conscious 
effort to study out proper methods of instruction and accurate 
measures by which to gage the student’s progress. 

During the course of this program of experimenting considerable 
time was given to the working out of an aptitude test which would 
predict within the field of accounting more accurately than the 
general psychological test given to all students entering the university. 
Such a test was finally devised. While it has never been used as a 
sole basis for judgment of a student’s aptitude or for assigning him 
to a particular section, it has proved of a great deal of value in a 
number of ways as an aid in the administration of the course. The 
purpose of this paper is to give an account, not of the trials and 
tribulations incident to building the test, but of certain practical uses to 
which the test has been put. 

It should be made plain at the outset that this is no attempt to 
justify the use of aptitude tests, the value of segregation of students 
as to ability, or the use of a series of objective tests in measuring 
accomplishment. It is simply an account of the problems of admin- 
istration encountered by those in charge of the course in accounting 
at the University of Oregon with reference at certain points to the use 
of an aptitude test in attempting to solve some of these problems. 


ADMINISTRATION OF SINGLE CLASSES USUALLY SIMPLE 


The administration of many university classes is exceedingly simple. 
The professor in charge of the class is a specialist in his particular 
field. The class is composed of a relatively small number of 
individuals who are supposed to be interested in the professor’s specialty. 
In many instances they may be expected to bring a reasonable amount 
of interest and judgment to bear upon the material presented. In 
such classes questions of attendance, individual differences in natural 
aptitude, and even defined standards of accomplishment are of minor 
importance to the instructor. He is inclined to regard himself as a 
fountain of information, from which the student may quaff in amounts 
suitable to his, the student’s, individual taste and capacity. The 
instructor may even defend a certain lack of reliability of 
examinations and consequent grades on the ground that after all the thing of 
real value to the student is what he actually carries away with him in 
the way of new knowledge, inspiration, or ideals, and that the grade 
received at the end of the term is of little or no real consequence. 

In those many courses where the subject is taught to single classes 
by an instructor whose main interest is in the field involved, and where 
the class enrollment is made up principally of students whose interest 
and natural capacities have induced them to explore that particular 
field of knowledge, the instructor is probably fully justified in devoting 
most of his thought and energies to enriching the content of the course 
rather than to the manner of presentation or to objective measures of 
accomplishment. 

GROWTH OF SECTIONED COURSES 

Recent years, however, have brought about a situation in our uni- 
versities where questions of class administration assume considerable 
importance. There has been a remarkable growth in enrollments in 
certain courses of general interest. The enrollment in many courses 
has been artificially stimulated by prescribing them as foundation or 
“background” courses. There is a marked tendency to reserve any 
great freedom in electing courses until the junior and senior years, 
thus forcing students into courses in which they have little real interest. 
These factors all tend to create a situation where the course must be 
taught in a number of sections and where the specialist in charge may 
teach only a limited number of classes. The remaining sections must 
be taught by graduate assistants or instructors. Questions as to the 
natural capacity of the student, and as to his interest in the subject 
may now become problems of great importance. A great variability 
in the teaching as between instructors may become evident. Differ- 
ences in individual standards of accomplishment may appear. Dif- 
ferences in objectives may develop according to the preparation and 
interests of the individual instructors. These things may cause a 
good deal of dissatisfaction among the students themselves. They 
also may result in a very unsatisfactory situation from the standpoint 
of the faculty, particularly in courses designed as preliminary or 
prerequisite to more advanced work. 
ADMINISTRATION PROBLEMS OF SECTIONED COURSES 

Problems of administering large sectioned courses fall rather natu- 
rally into three groups: Problems having to do with (1) the content 
of the course itself, (2) the teaching personnel, and (3) the student. 

PROBLEM OF COURSE CONTENT 

This paper will not attempt to dwell upon the first of these three 
groups. It is obvious that problems of content will vary with the 
particular course and with the individual institution. For example, 
in determining the content of the course in accounting at the Univer- 
sity of Oregon, it was decided to put the principal emphasis upon 
certain interpretive and managerial aspects of accounting rather than 
upon the acquirement of a bookkeeping technique. Such an objec- 
tive might not be at all suitable in certain other institutions and the 
content of the course itself would therefore need to vary with the 
objective of the institution. 

PROBLEMS OF TEACHING PERSONNEL 

The problems that have to do with the teaching personnel are, of 
course, many and at times extremely perplexing. Any effort to 
secure anything like a standard result with a group of persons natu- 
rally as individualistic in their make-up as are college instructors, is 
bound to be a very difficult and at times an exceedingly annoying 
problem. 

If certain definite objectives could be agreed upon and if a measure 
of some sort could be applied which would indicate the progress 
made by individual instructors toward the attainment of that 
objective, the results of such a measurement would provide a most 
illuminating and forceful argument in dealing with the instructor. However, 
in measuring the results accomplished by an instructor, one must 
first know the quality of material the instructor had to work with. 
If some measurement of aptitude were had, one might select “pairs” 
of students of approximately the same aptitude from sections taught 
by different instructors. A comparison of the accomplishments of 
the students of similar capacity but receiving instruction under differ- 
ent teachers would be illuminating. 

Such a plan was used at the University of Oregon as an aid in 
evaluating the quality of teaching done in the basic accounting courses 
during the fall term, 1929. There were 7 instructors who taught at 
least one section of accounting. Seven groups of 12 students were 
then found whose aptitude scores indicated that they were approx- 
imately equal. These groups had almost the same total aptitude 
and were almost exactly equal man for man. Each group was selected 
from the class of a single instructor. These groups were built up as 
follows: A student with an aptitude score of approximately 104 was 
selected for each group; similarly another “pair” with an aptitude of 
approximately 99 was selected, and so on, until there were 12 pairs 
selected with aptitude scores ranging from 104, which was a relatively 
high score, to 68, which was relatively low. In order to make the 
selection an absolutely impersonal one this pairing was done without 
access to the accomplishment (criterion) score. 
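
The pairing procedure just described can be sketched in code. The following Python illustration is a hedged reconstruction, not the procedure actually used at Oregon, which the paper does not specify beyond the description above. The function names, the tolerance of 2 points standing in for "approximately," and all sample scores are hypothetical.

```python
from statistics import mean


def build_matched_groups(sections, targets, tolerance=2):
    """For each instructor, pick one student near each target aptitude score.

    sections: dict mapping instructor -> list of (student_id, aptitude).
    targets: aptitude levels to match on (e.g. 104, 99, ..., 68).
    tolerance: largest allowed gap from the target (an assumed value).
    Accomplishment scores are deliberately not an input, mirroring the
    statement that pairing was done without access to the criterion score.
    """
    groups = {}
    for instructor, students in sections.items():
        available = list(students)
        chosen = []
        for target in targets:
            # Closest remaining student to this target aptitude level.
            best = min(available, key=lambda s: abs(s[1] - target), default=None)
            if best is None or abs(best[1] - target) > tolerance:
                raise ValueError(f"no match near {target} for instructor {instructor}")
            chosen.append(best)
            available.remove(best)
        groups[instructor] = chosen
    return groups


def mean_accomplishment(groups, accomplishment):
    """Average accomplishment score of each instructor's matched group."""
    return {
        instructor: mean(accomplishment[student_id] for student_id, _ in members)
        for instructor, members in groups.items()
    }


# Hypothetical data: two instructors, three aptitude levels.
sections = {
    "A": [("a1", 104), ("a2", 98), ("a3", 69), ("a4", 85)],
    "B": [("b1", 103), ("b2", 99), ("b3", 68), ("b4", 120)],
}
accomplishment = {"a1": 230, "a2": 210, "a3": 160, "a4": 190,
                  "b1": 200, "b2": 190, "b3": 150, "b4": 260}

groups = build_matched_groups(sections, targets=[104, 99, 68])
print(mean_accomplishment(groups, accomplishment))
```

Because every instructor's group is drawn at the same aptitude levels, a difference in mean accomplishment between groups cannot be attributed to a difference in measured aptitude, which is the point of the design.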

The aptitude scores of the groups are shown below. 

The accomplishment score of each student was then determined 
and tabulated: 

The accomplishment scores of the students in each group were 
taken as the measure of the teachers’ efficiency. This may be open 
to some criticism on the score that the groups are not necessarily 
equal in application, interest, and other elements which can not be 
measured by an aptitude test. The answer is that these factors are 
the ones which most test the teacher’s skill and that the factors 
which are not susceptible to the influence of the teacher are probably 
a minor consideration. Instructors ranked as follows: 

A with a score of 2,564 
F with a score of 2,393 
E with a score of 2,387 
C with a score of 2,376 
G with a score of 2,360 
D with a score of 2,162 
B with a score of 2,088 

The typical accomplishment of careful and efficient teaching is 
probably represented by the scores in Group II. 

A similar plan had been used in 1928, with a good deal the same 
sort of grouping as the result. In the fall of 1930, instead of picking 
pairs, the entire section was used as the basis for comparison, by 
reducing the aptitudes and accomplishments to standard scores. 
The results were very similar, however, with the tendency to show 
that there were one or two instructors who did not measure up to 
quite the same standard of result as set by the main group. 

Such comparisons of achievement and aptitude are very helpful 
to those in charge of the course. One should state, however, that 
these comparisons are never used as the sole basis for judgment of 
the teacher’s efficiency but rather as important corroborative evidence. 

Two particular results have been noted from this use of the aptitude 
test. First, the evidence seems to support the theory that it is the 
careful and painstaking teacher who produces the most consistent 
and efficient results, rather than the more brilliant but somewhat 
careless type. Second, there is a tendency for the teachers to reach 
about the same plane of accomplishment when they are aware of the 
results of such a measure. This last may be partially due to the 
opportunity it gives the administrator to put pressure in the right 
place. At any rate, the use of the aptitude test in this way has 
tended to equalize many of the usual differences in teaching. 

PROBLEMS HAVING TO DO WITH STUDENTS 

The problems that have to do with the students enrolled may be 
listed as follows: 

1. Should the student be in the course at all — are his interests and 
capacities such as fit him for this type of work? 

2. Are there any considerable numbers of students whose 
preparation or backgrounds are such that they would do better work if 
sectioned by themselves? 

3. Can one discover groups of varying abilities, and can such 
groups be segregated so as to make possible the development of 
methods and the adjustment of course content to better meet their 
needs? 

SHOULD THE STUDENT BE IN THE COURSE? 

One of the most difficult situations which arise in connection with 
large sectioned courses concerns the student who has no natural 
aptitude for work in that particular field. The problem of treating 
such a person fairly and justly becomes much aggravated when the 
course is a prescribed course which the student must take no matter 
how distasteful it may be. Our usual treatment of such cases is to 
make the student secure a grade in open competition with scores of 
persons naturally better fitted to do the work of the course. If he 
fails to measure up to the standard set by this larger group of inter- 
ested and more capable persons, we give him a grade of F. The 
whole proceeding is about as just as requiring a one-legged man to run 
a foot race in competition with normal men, and shooting him if he 
comes in last. The use of the aptitude test reveals that many cases of 
so-called indifference and laziness are really cases of a definite lack of 
ability. Unfortunately, although the use of an aptitude test may 
reveal the reason for failure under such circumstances, it does not 
suggest a remedy. However, the aptitude test does serve to reveal 
such a situation and challenges the conscientious administrator to 
work out some remedy. 

SEGREGATION AS TO BACKGROUND 

Even with those students whose aptitudes are somewhat higher one 
finds a wide variation of background and interests. Such variations 
are not likely to be evidenced by the aptitude test. In the course in 
accounting at the University of Oregon there are about 50 women in 
the group of 450 students enrolled. It was found that 3 or 4 women in 
a group of 30 or 35 men apparently were at a disadvantage. A 
separate section for women was arranged. The work of the young women 
seemed to improve somewhat, their interest seemed to be materially 
better, and the problem of teaching was simplified. This experience 
has suggested some interesting fields for investigation. 

SEGREGATION AS TO ABILITY 

The most obvious use of the aptitude test would seem to be to use 
the information thus found as a basis for segregating students into 
groups of similar ability, providing, of course, that such a segregation 
would result in a proper adjustment of the material given to the needs 
of the respective groups. 

Frankly, the administrators of the course in accounting at the 
University of Oregon have never found themselves ready to rely upon the 
aptitude test as the sole basis for such a segregation. Rather, for a 
period of three or four years, students have been sectioned in 
heterogeneous groups during the fall term. At the beginning of the winter 
term and again at the beginning of the spring term they are 
resectioned on the basis of the instructors’ grades. The development of 
a more careful and uniform system of grading has tended to make the 
grades given by individual instructors much more equable and just. 

APTITUDE SCORES AND THE INDIVIDUAL STUDENT 

There are many other ways in which the aptitude score has been 
used. A particularly important one is in connection with the personal 
contact between the instructor and the student. In this case we do 
not rely entirely upon the special aptitude score but take into account 
the psychological rating given by the personnel bureau to all entering 
students together with a comparable score of high-school 
accomplishment, also compiled by the personnel bureau. Thus, if a student has 
a high score in his general ability and a relatively low score in his 
high-school record, we have a fairly accurate idea as to his study 
habits. If his natural aptitude in accounting, as revealed by our test, 
is low, we must realize we have a certain problem in teaching; he must 
be watched closely and made to keep up on his work. If rather 
exceptional natural ability in accounting is indicated, the problem 
will be to stimulate his interest and correct his habits of study. 

APTITUDE TEST AS A BASIS FOR JUDGING ACCOMPLISHMENT 

At the present time the special thought of those in charge of the 
course has shifted away from further perfecting of the aptitude test 
to the development of new methods of presenting the material. The 
present plan of segregating students as to ability has one fundamental 
weakness. This plan has been to have all students cover the same 
ground, but to so arrange the matter as to permit the more capable 
persons to go into each point more intensively. The way this is 
working out is that the more able student does more work than the 
less able. So far as the practical necessities of the situation are 
concerned, it ought to be the less apt student who does the most work. 
He is the one who most needs it. To correct such a situation may 
require some very radical and unusual practices. We are considering 
at present the setting up of an experimental group of say 100 students. 
The work of these students would be set up in “budgets,” in many 
respects not unlike the contract system used in the secondary schools. 
Periods for supervised study would be arranged. The student would 
be permitted to carry on the work as rapidly or as slowly as his natural 
ability would permit, taking frequent objective quizzes and being 
required to repeat any portion of the budget which his quizzes indicate 
has not been thoroughly mastered. The remaining sections would be 
operated much as at present. Here again the aptitude test will 
serve a most useful end in giving a basis from which to measure the 
relative efficiency of the two methods of instruction. 
SUMMARY 

The matters set forth in this paper should be regarded as the 
account of an experiment which has been carried on at the University 
of Oregon for a period of several years. It is an expression of the 
views of those in charge of this work and is decidedly not to be 
interpreted as any effort to suggest that these views are applicable 
to any course or to every school. Those administering the course 
have found the development of an aptitude test exceedingly helpful 
in several ways. It has been an aid to them in measuring the 
effectiveness of the various teachers; it has helped in securing greater 
uniformity in objectives and in teaching method; it has acted as a 
reenforcement to the instructors’ judgment in segregating students 
on the basis of grades; and it has served as a criterion in judging the 
accuracy of our tests. Perhaps most important of all has been its 
effectiveness as a guide to the problem of the individual student, 
particularly when used in conjunction with the ratings for general 
scholastic ability and for high-school accomplishment. 

A most important though more indirect result in this particular 
instance has been the way in which the effort to perfect such a test 
has awakened those in charge to some of the really pressing problems 
of administering a large sectioned course. 




ESTABLISHING A STUDENT MENTAL HYGIENE CLINIC 

Othniel R. Chambers * 

The need, in an effective personnel program, of such a service as 
only a mental hygiene clinic could supply has been increasingly 
recognized since the days of the World War. Since Morrison and 
Diehl’s study (4)* in 1924 the percentage of serious cases has been 
fairly well established at a minimum of 8 or 10 per cent. Doctors 
Riggs and Terhune (5) say 10 per cent in 1928. 

That it is not the poor student alone who is in need of aid is stressed 
by such studies as the anonymous study (1) published in 1921 
showing that some two-thirds of the Phi Beta Kappa graduates of one 
institution have shown signs of neuroticism or psychoneurosis. 

The chief contention has arisen as to the method of establishing 
the needed mental hygiene service. Certain critical factors must be 
considered in deciding upon the mode of attack. Some of these 
include (a) the groups in need of aid, (b) the scholastic ratings of 
these groups, (c) the development of wholesome campus attitudes 
toward the service, (d) the content of the course to be given as the 
basis of the mental hygiene service, (e) the attitude of the faculty 
toward the service, (f) the securing of maximal returns on a minimal 
investment, (g) the consequent use of all personnel and equipment 
already available upon the campus, (h) securing the public’s interest 
in the venture in order to (1) educate the public and advance the 
mental hygiene program in general, and (2) secure outside support 
for the institutional program. 

The individuals needing aid may be grouped in three groups: 
(a) The large group of students who could secure sufficient aid from 
class instruction in mental hygiene; (b) a smaller group whose 
problems are of such a nature as to require more or less personal aid, 
analysis, etc.; and (c) a small group whose mental condition is too 
serious to justify care by the institution and who will have to be 
withdrawn and placed in the hands of psychiatrists with hospital 
facilities. 

As you note, these groups are of rapidly decreasing size, of rapidly 
decreasing importance to the institution, and to the State at large, 
and of very markedly decreasing hopefulness from the point of view 
of prognosis. 

The first group is outside the 10 per cent which has always been 
stressed, and its exact or even approximate size is unknown. The 
second and third groups make up that much-discussed 10 per cent 
referred to in the opening paragraph. 

The second major question concerns the year of school in which 
the greatest need lies. It is rather evident that the problems will 
be more marked in transition periods and hence the freshman year 
would be expected to yield considerable maladjustment. This is 
actually found to be true. It would seem then that the place to 
begin is in the freshman year. 

The mode of attack is rather vital. Publicity is not desired as 
wrong impressions are nearly certain to gain credence and the work 
be looked upon as work with “nuts” — the abnormal. 

As mental hygiene stresses prevention and much the larger group 
needs only that service which can be rendered by class instruction, 
it is advisable to enter a wedge by the establishment of a freshman 
course in mental hygiene. This should, the writer believes, be made 
an elective course. The reason for this is that the attitude requisite 
for mental hygiene aid and mental therapy is not secured by forcing 
that aid upon the individual to be helped. Moreover, the 
maladjusted in college are much more likely to be hostile to authority 
than are the well adjusted. The content of such a course has been 
very adequately outlined by Doctor Blanton (2) and quoted by 
Bohannon (3). Of necessity that content must be adjusted to the 
local situation. An instance may be given. At our institution three 
orientation courses are required of large numbers of students. One 
of these courses is “How to study.” This topic, then, which appears 
in the Blanton-Bohannon outlines had to be dropped. Local 
situations as regards the social significance attached to fraternity and 
sorority membership, the extremes between rural and urban culture, 
etc., will also alter course content or determine illustrative material. 

Certain it is that the course should not be a course in abnormal 
psychology and should not induce a morbid attitude on the part of 
the students. The appearance of, the cause of, the effect on efficiency 
and mental health of, and the removal of such things as exaggerated 
emotions, day dreaming, homesickness, inferiority feelings, 
rationalization, phobias, unhealthy attachments, compensations, fatigue 
(mental and physical), etc., should be stressed, not from the point of 
view of the abnormal but from that of their appearance even in you 
and me. 

The development by each student, late in the course, of an 
autobiographical case study, and the giving of a number of tests in the 
course and their grading by the student, give the student an estimate 
of his own needs and point out to the instructor those needing 
personal aid. 

The granting of clinical aid to those desiring it, its use being 
purely optional, brings in almost all those students needing aid. 

In the writer’s experience 10 per cent of the classes came in during 
the early days of the course. Now more and more people needing 
the course are electing it. Students advise friends to take the course. 
Deans, instructors, and student advisors recommend it. Eventually 
the course becomes a more or less effective selective agent taking in 
a larger and larger percentage of those needing aid. Last term 17 
per cent of those taking the course came to the clinic for personal aid. 

The Thurstone and Thurstone psychoneurotic inventory (6) has 
been found extremely effective for giving in the class. Boys making 
scores over 67 and girls making over 75 are, in more than 98 per cent of 
the cases, in real need of personal aid. More than 85 per cent of 
boys making over 58 and girls making over 65 are in need of such 
aid. This test, while it does point out certain types of cases, does 
not point out others. A low score on this test is no guarantee of 
emotional stability. 

The Thurstone and Thurstone psychoneurotic inventory should not 
be given until rapport is established with the student, nor should 
interviews be proffered before fairly late in the course, for the same 
reason. 

This teaching of mental hygiene is not a mere incidental in a pro- 
gram of this type. It is the very groundwork, acting as a preventive, 
as a sales force to the student body and to the faculty, and bringing 
in cases for clinical study by an “endless chain” method. Its cost 
is light. 

In the establishment of the clinic, advantage should be taken of all 
available personnel and equipment. The instructor in mental hygiene 
and the case worker, a member of your psychology staff, should 
hold no official administrative position such as deanship, assistant 
deanship, etc., if he is to secure absolute confidence and also full 
information. Moreover, he should have a broad toleration. While 
the case worker should, in the writer’s estimation, have a religious 
faith, he should not be very closely identified with any sectarian 
organization, as such identification tends to prevent certain 
confidences. No case should ever be handled by a person whom the 
student’s problem upsets emotionally. 

It would be most unfortunate to have an out-and-out Freudian or 
disciple of any one school in charge of the clinical work. 

The use of the staff perhaps can well be illustrated by our situation 
at Oregon State College. Work in speech adjustment is shunted to 
Professor Wells, of the speech department; reading difficulties are 
turned over to Doctor Parr; vocational guidance problems go to 
Professor Salser; physical examinations are made with especial care 
by the physicians of the health service; aid in religious problems is 
secured from Reverend Warrington or from the pastors of local 
churches. 

The establishment of such clinical facilities should be quiet, 
nonadvertised, nonspectacular in growth, and founded on mental-hygiene 
instruction in the freshman year. 

TEACHER-APTITUDE TESTS AND TEACHER SELECTION 

Nelson L. Bossing ¹ 

I. THE NEED OF DISCOVERING BETTER SELECTIVE DEVICES 
FOR TEACHERS 

In the field of education we have become increasingly conscious of 
the need of a better type of selective device than the usual ones 
employed if we would choose and direct wisely those who should enter 
the teaching profession. When teachers were few our chief problem 
was to find a sufficient number of teachers to man our schools. That 
day is past. We are now confronted with a numerical oversupply of 
teachers. It is now a question of the selection of the best available 
for training and placement. 

According to the Bureau of Education Bulletin, 1929, No. 17, on 
Teacher Training: 

The number of students enrolled in all types of institutions which train teachers 
is more than one-half million. This is more than 400 per cent greater than the 
number undergoing training two decades ago. During the same period the 
number of teaching positions has increased by approximately 35 per cent. 

Again quoting from another Bureau of Education Bulletin, 1929, 
No. 14, which discusses the number of teachers now in training for 
public-school positions throughout the United States, we find this 
startling conclusion: 

It is safe to assume that these institutions (colleges and universities) are 
interested primarily in the preparation of teachers for high-school positions. If so, 
we face the probability that at this time there is a student in training for every 
high-school teacher position. 

Professor Miller, of the school of education, Columbia University, 
in the June, 1929, issue of the High School Teacher estimates that 
we have an oversupply of 150,000 teachers. 

This, of course, is not the entire story. Another consideration that 
makes urgent better selective devices is the fact that among those in 
preparation for or now teaching, we have a large number of misfits, 
people who have met the standards of preparation required for 
certification and yet who find themselves temperamentally or through 
other deficiencies incapable of successful work in the profession. 

¹ Nelson L. Bossing, professor of education and director of supervision, University of Oregon. A. B., 
Kansas Wesleyan University, 1917; M. A., Northwestern University, [year illegible]; Ph. D., University of Chicago, 
[year illegible]. He was formerly head of the department of education at [institution illegible]. 
Publications: “History of Educational Legislation in Ohio from 1851 to 1925,” F. J. Heer 
Publishing Co., Columbus, Ohio, [year illegible]. 

A third factor which gives emphasis to the matter of selection is 
the fact that professional standards in education have become quite 
rigid and indications are that the standards for the profession of 
education will soon approximate the level of some of the recognized 
professions in the quantity of training required. For example, we 
now require four years of training for the high-school teacher in the 
State of Oregon, whereas not a few years ago graduation from high 
school was sufficient academic preparation with which to enter 
high-school teaching positions. Our two neighboring States, Washington 
and California, have each entered upon a 5-year plan of preparation 
for those who are to be certified as high-school teachers. Obviously 
everything that humanly can be done should be done to safeguard 
both the prospective teachers and the profession against the 
preparation of those who lack the necessary qualifications for professional 
success. It should be possible to prevent the unpromising candidate 
from entering upon such an extensive course of training. 

Still another factor which looms large in these days of economic 
depression is that of teacher-training costs. According to the United 
States Bureau of Education Statistical Circular, No. 11, on Per Capita 
Costs in Teacher-Training Institutions, 1927-28, we find approximately 
$300 the average cost reported for a year’s training of prospective 
teachers, with wide variation for individual institutions of from 
$194.80 to $439.67. 

Against this general need which flows from the increasing demands 
of the profession of teaching we must face frankly the inadequacy 
of the past and present methods by which we have attempted, and 
still attempt, to determine who are and who are not good teaching risks. 

II. ATTEMPTS TO DETERMINE TRAITS OR FACTORS THAT 
RELATE TO TEACHING SUCCESS 

Among the earlier efforts to evaluate teachers was that of the rating 
scale. The use of the rating scale has reflected change of emphasis 
over a period of years. The first scales were used by school super- 
intendents and school administrators to determine the merits of 
teachers within their employ either for promotion or elimination 
from the system. Later the scale was looked upon by leaders in the 
profession as a device by which to improve the teaching power of 
the teacher herself. It is only within recent years that attention 
has been focused upon the desirability of determining traits or factors 
that might have prognostic value in determining who should and 
who should not become teachers. The attempt, therefore, to determine 
specifically and as far as possible objectively the presence of 
measurable traits or factors of teaching success or aptitude is of 
relatively recent date. Dr. F. B. Knight (6), in his doctor’s study at 
Columbia University, recognized three studies of exceptional value 
on this subject prior to that date. They were J. L. Meriam’s study 
on Normal School Education and Efficiency in Teaching, carried 
on at Columbia University and published in 1906 (11); The Development 
of a Grade Scale, by Dr. Edward C. Elliott, in 1910, in which 
he developed a tentative scheme for the measurement of teaching 
efficiency, later revised; and an extensive research 
by A. C. Boyce, published under the general title “Methods for 
Measuring Teachers’ Efficiency” (2). Prior to 1925 but some half 
dozen studies of significant value seem to have been undertaken. 

In addition to those previously mentioned, Dr. G. T. Somers’ 
Pedagogical Prognosis: Predicting the Success of Prospective Teachers (16), 
another research study for a doctor’s degree at Columbia University 
in 1923, and the monograph of Dr. F. L. Whitney on The Prediction 
of Teaching Success, published in 1924, comprised the battery of 
studies considered of value (19). Since that date some two dozen 
studies of varying degrees of merit on this subject have been made 
and reported in periodical literature or monographic form. 

For some time it has been generally assumed that scholarship was 
a quality that had large significance as a means of predicting future 
teaching success. Unfortunately a number of studies that have 
been made do not give us cause for great confidence in this factor 
as an instrument of prediction. For example, the results reported 
in a number of studies such as that of S. A. Hamrin (6) show a 
correlation of only 0.05 between school marks of teachers in training 
and the later ratings of these same teachers by superintendents in 
the field. Dr. F. B. Knight (7), in the study previously referred to, 
found a correlation of only 0.153 between ability to teach and 
scholarship. Doctor Whitney in his study found a correlation between 
academic marks and teaching success after graduation of but 0.073 
(20). Roy R. Ullman, in a study of the prediction of teaching success 
carried on for his master’s degree at the University of Michigan 
(18), found a correlation between teaching success in the field and 
general scholarship of 0.30. 

G. P. Cahoon, in an article in the May, 1930, issue of the University 
High School Journal (3), reports a correlation of 0.065 ± 0.06 
between academic grades and practice teaching success for one group 
of students, and for another group of the same year 
a correlation of 0.27 ± 0.06. He, therefore, concludes “that there 
is no relation between success in college as indicated by general 
college marks and success in practice teaching.” 

He further states “that the factors of success in each of the two 
situations are somewhat different. In college the emphasis is largely 
upon achievement in subject matter while in the secondary school it 
is upon achievement in instructing pupils with subject matter more 
as means to an end.” 
This is a reasonably fair sampling, I think, of the findings of some 
of the better studies that relate to the factor of scholarship as an 
element of prediction of teaching success. This should be said, 
however, in behalf of the findings of Mr. Ullman: his correlations 
with the 12 factors studied are generally low. 

Another factor that has been used as a basis of determining the 
availability of teaching candidates has been that of their professional 
record. Again Mr. Ullman (18) found a correlation of 0.30 between 
professional courses in education and later teaching success. Doctor 
Whitney (20) found a much higher relationship between teaching 
success and professional marks than with academic marks. He 
found a correlation of 0.143 for professional marks. 

Another factor frequently considered possible as a means of 
predicting teaching success is that of intelligence. However, there seems 
to be general unanimity on the part of those who have made 
worth-while studies at this point that there is not sufficient relationship 
between intelligence and teaching success to warrant confidence in 
intelligence as a predictive trait. Knight (8) used the early Thorndike 
college entrance examination as a measure of the intelligence 
of high-school teachers and found a correlation between intelligence 
and teaching success as estimated by teachers and supervisors of 0.41. 
Somers (17) in his research study for his doctor’s degree found that 
the correlation of intelligence as measured by mental tests and success 
in teaching gave a coefficient of 0.43. Whitney on the 
other hand found a correlation between teaching success after 
graduation and intelligence as measured by intelligence tests of 0.026 (20). 
Possibly the best of the later studies was that made by W. H. Pyle (16). 
He correlated scores on the Detroit Advanced Intelligence test with 
grades made in practice teaching and found a correlation of only 
0.163 ± 0.036. Again intelligence scores were correlated with the 
teaching of these students after the first year in the public schools. 
Ninety-nine cases used as a basis of the study gave a correlation of 
0.034 and a correlation for the second-year teaching in public schools 
of 0.023. In both situations the probable error was 0.066. Pyle, 
therefore, concludes that intelligence of students has no considerable 
value in predicting the later teaching success as judged by the 
criterion of principals’ judgments of teaching success. Similarly 
Ullman (18), by the use of the Brown psychology test, found a 
correlation of but 0.16. 

Again we have frequently utilized cadet teaching grades as 
predictive of later teaching success, and the evidence would seem to suggest 
that our highest correlations are to be found between the practice 
teaching of the prospective teacher and later success in the field. 
Meriam found a correlation of 0.443 (12) between practice teaching 
during the normal school training and teaching success after 
graduation. Whitney (20) found a correlation between teaching success 
after graduation and student teaching of 0.238. Ullman (18) in 
the study previously referred to found a correlation of 0.36. W. W. 
Ludeman in a study made at Ohio State University reports a 
correlation of 0.63 between practice teaching and later teaching success (10). 
These correlations, while not high, do suggest the value of cadet 
teaching grades as indexes of later success. 

It is evident, however, that cadet teaching, valuable as it is as an 
index of future teaching success, does not meet the need of a predictive 
device by which to determine the fitness of entering candidates into 
the teacher departments of our teachers colleges and universities, 
because cadet teaching is usually the last thing which the student 
does before entering his professional career. 

It is quite evident from a careful study of investigations thus far 
pursued that a careful analysis of our criterion has not been a matter 
of concern by students of this subject. Most investigators have 
assumed the value of their criterion without further investigation. 
To the writer’s knowledge only one or two studies have been made in 
which the criterion has been subjected to careful appraisal. Yet he 
finds himself in full agreement with Doctor Jacobs that in the absence 
of better standards the opinion of competent judges must be accepted. 
Doctor Jacobs (21) succinctly puts his case thus: 


From the work of these investigators and from readings in related fields, e. g., 
personnel management, there emerges the conviction that the most reliable 
criterion at the present time with respect to teacher effectiveness is the 
conscientious and deliberate opinion of competent judges. 

The theory underlying this conclusion is that where objective measures in 
terms of amount are not fully applicable to determining the degree to which a 
given characteristic or group of characteristics is present in a certain situation, 
then the opinion of competent judges must be resorted to either wholly or in part. 
This hypothesis is the basis for practically all social measurements. It must be 
granted that, as in the case of this study, while it is possible to establish the 
reliability of personal judgments, there is no way of proving their validity. It is 
because of this difficulty to prove validity that attempts are constantly being 
made to find a truly objective method of determining teacher effectiveness. 
The greatest difficulty here, and this is almost universally overlooked by the 
investigations in the field of teacher rating, is that the product of the teacher’s 
effort really comes to fruition not in a week, nor in a month, nor yet in a year. 
The fact is some 10 to 20 years must pass before the fruition is attained. 

It is true that in the absence of means for measuring the actual product, 
measurements that are predictive of ultimate effectiveness must be resorted to. 
But, if we do resort to such measurement as a method of estimate, we must be 
sure that what we measure is valid as a criterion of estimating. And since being 
sure is a matter of opinion, we are led to the comment that there are certain 
conditions under which opinion must be accepted as fact. 

Until the time arrives, then, when an objective method for determining the 
comparative effectiveness of different teachers has proven both valid and reliable, 
the subjective opinion of competent judges must continue to serve the purpose. 

However, if we are to find adequate leads for the discovery of 
traits or factors of success it is necessary to subject the criterion to 
careful analysis for what it may yield. 

During this year we have undertaken to determine the value of 
the criterion by which teachers are adjudged successes or failures in 
the field for the possible purposes of discovering predictive elements 
that might lead to the development of a predictive test. An 
attempt has been made, therefore, to discover what elements make 
up the total value of the criterion. For this purpose we undertook 
a study of the judgments of superintendents and principals who have 
rated our graduates from the school of education while at work in 
the field. 

III. A STUDY OF THE CRITERION OF TEACHING SUCCESS USED 

The general plan was as follows: For a number of years past the 
teacher placement bureau of the University of Oregon annually has 
sent to the superintendents and principals of secondary schools 
uniform rating blanks upon which to check the success and progress 
of teachers who are graduated from the University of Oregon. As a 
result of this policy over a period of years we had two or more ratings 
for a large number of teachers now in the field. The study began 
with the rating of 248 teachers who had taught two or more years 
and for whom we had received rating forms from their superintendents 
and principals. 

Because of the lack of complete and accurate data the number of 
teachers who could be used for this study was reduced to 165 cases. 
These 165 teachers represent 84 school systems which range in size 
from 2-teacher to 54-teacher high schools. For each of these teachers 
we had complete ratings representing two different years. In 93 
cases the teachers were rated by different judges and in 72 cases the 
teachers were rated by the same principal or superintendent. In 
every case a year elapsed between the ratings. In 47 cases the 
teachers were working in different school systems when they were judged 
the second time. Since it has been our custom to send a rating form 
for each teacher annually sometime in January, the judges had no 
knowledge of previous ratings given to the teacher. The academic 
and professional grade averages for these 165 teachers, used for 
supplementary purposes in the study, were taken from the registrar’s office 
of the University of Oregon. 

Treatment of data: finding reliability of criterion. In the treatment 
of the data secured, we began the study by an attempt to establish 
the reliability of our criterion. As mentioned before, the criterion is 
based upon the judgments of superintendents and principals who 
supervised the work of the teachers judged for one or more years. 
The ratings were recorded on regular forms sent out by the placement 
bureau and returned to the bureau, where they were filed. The 
rating form has 13 items to be evaluated by the principal. However, 
because certain items on this rating form were not applicable to the 
study we eliminated items Nos. 10, 11, and 13. Item No. 10 requested 
a judgment of the weakest point and No. 11 of the strongest point in 
the repertoire of the teacher’s equipment. No. 13 was a general 
invitation for comments not otherwise suggested on the rating form. 

The 10 items made the basis for our study were as follows: (1) 
Ability as instructor; (2) success in discipline; (3) industry; (4) 
character; (5) personality; (6) personal appearance; (7) health; (8) 
loyalty and cooperation; (9) attitude toward community; (12) general 
rank. 

For the first nine items the following words were used to rate the 
degree of success in each, namely: Very best, which was given a 
value of 6 points; Excellent, 5 points; Good, 4 points; Medium, 3 
points; Inferior, 2 points; Failure, 1 point. The “general rank” 
was item No. 12 on the score card, followed by these 6 adjectives or 
phrases, one of which the judge underscores to indicate his general 
opinion of the teacher’s success: Distinctly superior, very good, 
good, average, slightly below average, poor. In the determination 
of the reliability of these judgments the correlation technique was 
employed. In other words, the ratings of the teachers received the 
first year were correlated with the ratings received the second year. 

The correlation of item No. 12, “general rank,” was first secured 
independent of all other items on the rating form. The correlation 
between the two sets of judges gave a correlation coefficient of 
0.828 ± 0.020. A total of the judges’ ratings on the first nine items was 
next studied, for which a correlation coefficient of 0.052 ± 0.131 was 
found. This would suggest very little agreement on the part of the 
judges when they consider the different traits on this rating form but 
indicates considerable agreement when they consider the above 
“general rank” of the teachers. 
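
The reliability check described above, in which one year's ratings are correlated with the next year's ratings for the same teachers, can be sketched as follows. This is an illustration only: the ratings are invented, the function names are hypothetical, and the probable-error formula used, 0.6745(1 - r²)/√N, is the conventional one of the period; the paper does not state which formula was actually applied.

```python
import math


def pearson_r(first, second):
    """Product-moment correlation between two equal-length lists of ratings."""
    n = len(first)
    if n != len(second) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 cases")
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_a * var_b)


def probable_error(r, n):
    """Conventional probable error of r (assumed formula, not from the paper)."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# Invented "general rank" scores (6 = best, 1 = poorest) for ten teachers,
# given by their judges in two successive years.
year_one = [6, 5, 4, 4, 3, 5, 2, 6, 3, 4]
year_two = [5, 5, 4, 3, 3, 6, 2, 6, 4, 4]

r = pearson_r(year_one, year_two)
print(f"r = {r:.3f} ± {probable_error(r, len(year_one)):.3f}")
```

Under this formula the probable error shrinks as the number of cases grows and as r approaches 1, which is why a coefficient from a handful of cases deserves less confidence than the same coefficient from 165.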

Following these two attempts at a general or composite rating a 
study was made to determine the part each of the nine items on the 
rating form played in the total ranking the judges gave the teacher. 
The weighted value of each of these nine items was determined by 
the beta regression equation, using partial multiple correlation 
technique. The values found are as follows: 


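    1. Ability as instructor ............ 0.779 
    2. Success in discipline ............  .289 
    3. Industry .........................  .210 
    4. Character ........................  .040 
    5. Personality ......................  .071 
    6. Personal appearance ..............  .191 
    7. Health ...........................  .299 
    8. Loyalty and cooperation ..........  .191 
    9. Attitude toward community ........  .008 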


lOnai— 82 9 
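
One way to read weights of the kind tabulated above is as standardized (beta) regression coefficients. The sketch below assumes they satisfy the normal equations R·β = r, where R holds the intercorrelations among the rated items and r holds each item's correlation with the general rank. The paper does not print its intercorrelations, so the two-item figures here are invented, and the helper names are hypothetical.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the caller's data is left untouched.
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("intercorrelation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                for k in range(col, n + 1):
                    aug[row][k] -= factor * aug[col][k]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def beta_weights(intercorrelations, criterion_correlations):
    """Standardized regression weights of each item on the criterion."""
    return solve_linear(intercorrelations, criterion_correlations)


# Invented example: two rated items that correlate 0.50 with each other and
# 0.60 and 0.30, respectively, with the general rank.
intercorr = [[1.0, 0.5],
             [0.5, 1.0]]
with_general_rank = [0.6, 0.3]

betas = beta_weights(intercorr, with_general_rank)
print([round(b, 3) for b in betas])
```

In this invented case the second item receives a weight of zero although its raw correlation with the general rank is 0.30: everything it appears to contribute is already carried by its overlap with the first item. A small weight in such a table therefore does not show that the judges ignored the trait.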
From these data it is evident that “ability as instructor” is almost 
as important in the estimate of the judges as the other eight items 
taken together. It further suggests that “ability as an instructor” 
has much the same implication for the judges as item No. 12, “general 
rank.” This point is further emphasized when we note the 
correlation coefficient between item No. 1 of one set of ratings and item 
No. 1 for the rating given a year later. This 
fact is further borne out by a comparison of the correlation coefficients 
between the several items on the one set of ratings compared with 
similar items on the second set. The comparisons for the two years 
are as follows: 


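    Ability as instructor ............. r₁₁ = 0.792 ± 0.114 
    Success in discipline ............. r₂₂ =  .241 ± .101 
    Industry .......................... r₃₃ = [illegible] 
    Character ......................... r₄₄ =  .201 
    Personality ....................... r₅₅ =  .201 
    Personal appearance ............... r₆₆ =  .801 
    Health ............................ r₇₇ =  .492 
    Loyalty or cooperation ............ r₈₈ =  .002 
    Attitude toward community ......... r₉₉ =  .023 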

The conclusion to be drawn here is the same conclusion that is 
drawn by Mr. Knight in his study Qualities Related to Success in 
Teaching, where he found that there was no evidence of ability on 
the part of judges to analyze and weigh the factors that entered into 
the general judgment of teaching ability. His conclusions, however, 
were drawn from almost diametrically opposite results, because most of 
his correlations on individual factors were high while the relationship 
in certain instances could not be justified on any basis of rationality. 
For example, he raises the question as to what rhyme or reason there 
is in a correlation of voice with intellect of 0.682 or a correlation of 
voice with accuracy of 0.628. He concludes, therefore, that there 
is no rational basis for the relationship between factors as found in 
his study, but that his judges were influenced by the halo effect of 
their general judgments upon specific factors (9). 

As a further check upon the criterion and its possible significance 
for our judgments of prospective teachers we sent to superintendents 
and principals of the 165 cases studied a somewhat different rating 
form which is used by the school of education in determining the 
factors of success of our cadet teachers. This special form is filed 
with the director of supervision at the close of the teaching period 
of each cadet along with a grade for the cadet which is represented 
by the general rating given the cadet on this rating sheet. This 
rating sheet was sent out to the superintendents and principals 
concerned some six weeks after the 1931 appointment bureau rating 
form had been returned to the appointment bureau. 

This form contains four major sections designated: (1) Personality 
equipment. (2) Social and professional equipment. (3) 
School management. (4) Teaching skill. Under these four 
classifications occur 36 items upon which a judgment was asked. The 
correlation was made between 100 of these forms that had been 
returned by the fifth of April with the similar rating forms on file in 
the appointment bureau for these same teachers while cadets. The 
following coefficients of correlation were obtained for the four major 
items: (1) Personality equipment, r₁₁ = 0.606. (2) Social and 
professional equipment, r₂₂ = 0.616. (3) School management, r₃₃ = 0.440. 
(4) Teaching skill, r₄₄ = 0.184. 

The sum of the correlations for the four major items on the scale 
gave a total correlation value of 0.476. These figures indicate first, 
that while there is a higher degree of interrelation between the four 
major items on the efficiency record form used with our cadet 
teachers and judgments in the field, the value for the total rating of 
these items of 0.476 does not give as high results as the correlation on 
the appointment bureau rating form where we secured 0.620 as the 
total ratings of the items studied. We are, therefore, prone to the 
same general conclusion that was reached in the study of 
appointment bureau rating forms, namely, that a general judgment is more 
susceptible of agreement between judges than is an attempt to 
correlate factors that make up the total judgment. Further, we conclude 
that a simple rating device seems to assure greater predictive accuracy 
than a more complicated one. 

In order that we might further check the possible values of 
this study as a basis of comparison with other studies we secured 
correlations between the criterion and (1) cadet teaching grades, 
(2) professional educational grades, and (3) all academic grades not 
including grades in professional education courses. The correlations 
of these grades with the criterion are as follows: (1) 0.687 ± 0.072; 
(2) 0.188 ± 0.066; (3) 0.172 ± 0.088. These results are in agreement 
with the findings of other investigators reported in the first part of 
this paper. 

The results of our study, therefore, would indicate that considerable 
confidence may be placed in the cadet teaching grades for prediction 
of success in later teaching but that grades in professional subjects or 
academic subjects for purposes of prediction are of very doubtful 
value. 

Because most of the studies reported are based upon the use of 
comparatively few cases it seemed desirable to experiment with 
samples from our 165 cases to determine the reliability of the criterion. 
Using the same ratio of different judges against the same judge as 
existed for the total of 165 cases, 67 cases were taken at random which 
resulted in a correlation coefficient of 0.878 ± 0.092. Another sample 
taken on the same basis but including but 20 cases resulted in a 
correlation of 0.90 ± 0.027. These checks serve to give considerable pause 
to the dependence to be placed in correlations where the total number 
of cases is small. It is characteristic of most of the studies made on 
this topic that the number of cases is relatively small. For instance, 
Doctor Morris in her doctor’s study on Personality Traits and Success 
in Teaching used but 60 cases to validate her study. Ullman in his 
recent study referred to above used but 116 cases. Boardman in his 
doctor’s study of Professional Tests as Measures of Teaching 
Efficiency in High School (1) had complete data for only 88 teachers who 
participated in the study. Knight in his doctor’s study previously 
referred to used three school systems which involved a distribution of 
teachers as follows: 

School A: Elementary teachers, 53; high-school teachers, [illegible]. 

School B: Elementary teachers, 35; high-school teachers, 13. 

School C: Elementary teachers, 30; high-school teachers, 10. 

However, the number of useful ratings he was able to secure totaled 
but 97. Part of his technique involved the study of correlations in 
each school system. The result was that some of his conclusions were 
based on extremely meager data. In a survey of the literature thus 
far but few studies have used a larger number of cases than employed 
here. The conclusions therefore that we would reach as far as this 
study is concerned are: 

1. Very few writers in this field have established the reliability of 
their criterion. 

2. Most studies are based on too small a sampling. 

3. General ratings by judges are more reliable than the judgment of 
individual factors that go to make up the general rating. 

4. Rating devices with few items to be scored will give a higher 
reliability than in the case of rating forms with a large number of 
items. 

5. The predictive value of cadet teaching grades at the present 
time seems to offer a better basis for the determination of future 
teaching success than any other criterion. 

6. Apparently little confidence can be placed in academic grades, 
professional grades, or intelligence ratings as a prediction of teaching 
success. 

7. A corollary conclusion to the above would be that an analysis 
of the criterion used in this study does not offer much suggestion as 
to what particular elements should enter into the formation of 
teaching aptitude tests. 

IV. APTITUDE TESTS 


During the last few years with the increasing evidence that no one 
specific factor might be utilized as a means of predicting teaching 
success a number of investigations have been made in an attempt to 
discover a battery of traits or factors which taken as a composite 
might serve as a practical instrument of prediction. The results of 
some of those studies have found expression in the development of 
aptitude tests which attempt to give a general prognostic index of 
the individual's success as a teacher. Among these tests may be 
distinguished roughly four types. The first more or less general type 
may be represented by the aptitude test for elementary and high-school 
teachers, by Bathurst, Knight, Rugh, and Telford. This test 
consists of six subtests under the following titles: (1) A professional 
judgment test. (2) A test over the theory and practice of teaching. 
(3) A test covering reading comprehension. (4) A test of social 
information. (5) A test of school and class management. (6) A test 
of professional information. 

The validity of the test is determined by the correlation of “the 
scores on this test with the judgments of superintendents and 
supervisors who had observed the teaching efficiency of each teacher judged 
for at least nearly one school year.” 

This general procedure was followed both for the elementary school 
teachers and the high-school teachers. A multiple correlation 
between the judgments and the scores for elementary teachers on the 
various tests was found to be 0.414, and for high-school teachers, 0.54. 
There is evidence that this test has some value as an instrument of 
prognosis. However, until the reliability of the test has been further 
studied, it should be used with caution. 
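
As a rough aid to reading the two multiple correlations just quoted, the sketch below squares each coefficient to give the share of variance in the judges' ratings that the test battery accounts for. Squaring a correlation in this way is a standard statistical identity and is not something the test authors are reported to have done; the function name is illustrative only.

```python
def variance_explained(multiple_r: float) -> float:
    """Return R squared, the share of criterion variance accounted for."""
    if not -1.0 <= multiple_r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return multiple_r ** 2


# Multiple correlations quoted in the text for the two teacher groups.
reported = {"elementary": 0.414, "high school": 0.54}

for group, r in reported.items():
    print(f"{group}: R = {r:.3f}, R^2 = {variance_explained(r):.3f}")
```

On these figures the battery accounts for roughly 17 per cent of the variance in the ratings of elementary teachers and about 29 per cent for high-school teachers, which is in keeping with the caution urged above.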

A slightly different test but of somewhat general character is the 
Stanford educational aptitude test devised by Milton B. Jensen. 
The test attempts to be more specific than the aptitude tests for 
elementary and high-school teachers, since it claims to predict the 
candidate's fitness for specific aspects of professional education, namely 
(1) a combination of teaching-research, (2) a combination of research- 
administration, and (3) a combination of teaching-administration. 

The test is a rather complex one to score and the test itself is cast in 
a form of situations to which the candidate must give a judgment. 

These situations are classed under three general headings, or rather 
form three separate tests: 

1. Position preference ratings in which the prospective teacher is 
asked to indicate a preference for type positions that might be 
available. 

2. Discipline case problems in which certain judgments are required 
in the solution of the problem set up. 

3. This test centers around high-school activities and again asks 
for a judgment in terms of certain general situations that appear in 
the normal activities of the school, particularly as it affects 
community relationship. The author describes the plan by which the 
validity of the test was established as follows: 

Each item of the test has been carefully weighted in such a manner as to give 
its greatest possible contribution to each of the scales. Members of the criterion 
groups (205 men in number) were selected on the basis of ratings by from one 
to seven judges. Maximum values were obtained from these ratings by means 
of statistical procedure involving reliabilities and variabilities of the various 
judges' ratings and the number of judges rating each individual. From these 
weighted ratings cases were selected in such a manner as to give the best 
categorical selections with the smallest probable errors. Experience with the test 
indicates that coefficients of correlation of from 0.80 to 0.90 may reasonably be 
expected between test scores and ratings such as were used in selecting the 
criterion groups. It was found that differences measured by the test are 
independent of age, sex, professional training, and professional experience, and that 
they are closely associated with self-ratings by individuals who have had 
extensive professional training and experience. The relationships above mentioned 
are shown by the correlations of the tests. 

This test along with the test by Bathurst, Knight, Rugh, and 
Telford is considered by Max E. Engelhart in his appraisal of 
standardized tests for students of education as the two outstanding teaching 
aptitude tests now available (4). 

Another general aptitude test which has received considerable 
attention is known as the George Washington University teaching 
aptitude test. This test is divided into five parts: (1) Judgments in 
teaching, (2) reasoning and information concerning school problems, 
(3) comprehension and retention, (4) observation and recall, and (5) 
recognition of mental states from facial expression. However, this 
test has been subjected to rigorous and experimental procedures by 
Anna R. Markt, of the National Kindergarten and Elementary 
College of Chicago, and Prof. A. R. Gilliland, of Northwestern University. 
In brief, they gave the test to a group of 146 freshmen girls in the 
National Kindergarten and Elementary College with no teaching 
experience and another group of 143 sophomore girls of the same 
institution with practice teaching experience of from 36 weeks. 
They found that there was no significance in the relative scores of the 
two groups although the reliability coefficient for the test was rather 
high. They concluded, however, that the test was a better test of 
mental ability than it was of teaching aptitude. 

A second somewhat distinct type of aptitude test available is the 
Vocational Interest Blank, developed by E. K. Strong, of Stanford 
University. The test is broken up into eight subtests which attempt 
to discover interest characteristics in a number of different categories. 
Three reactions are possible for each situation: liking, indifference, or 
dislike. The subject is supposed to register his first reactions to the 
situation without permitting time for a studied response. Interest 
profiles have been established for at least 22 vocations and scoring 
keys provided, one of which is for teachers. 

This test is validated by correlation with characteristic interests 
of a selected successful group within a given vocation or profession 
as the criterion. Norms are given with the score key for teachers. 

The scale is based on the records of 193 educators, nearly all of whom are 
members of Phi Delta Kappa (the national professional education fraternity). 
Approximately one-half are school administrators, one-third teach education in normal 
schools and colleges, and one-fourth teach in high schools or grammar schools. 

The third and first quartiles and median scores are given with 
grade steps A, B, B+, B-, and C established with the per cent 
of the total distribution of marks given for each grade. 
Unfortunately no further data were available to the writer on the reliability 
or validity of the test for teachers. However, Strong in his Manual 
for Vocational Interest Blank, September, 1930, page 10, makes this 
cautious statement: 


It will be some time before the validity of this test can be exactly determined. 
Results so far obtained show that the test has genuine merit. 


The validity of the test rests finally upon the assumption that 
there are dominant interest characteristics peculiar to vocational and 
professional groups, and that these dominant interests are either 
native or are so early and firmly established in the individual that 
at least by the time later adolescence is reached these interests 
have developed fixity or essential permanency. That the permanent 
nature of these dominant interests is accepted by psychologists in the 
vocational field is now quite generally recognized. The development 
of predictive tests based upon basic interest factors in the teaching 
profession gives promise of significant results. 

A third type of predictive test developed for teachers is best 
represented by Dr. Elizabeth Morris' trait index L test. This test assumes 
to indicate the presence of certain personal traits considered essential 
to the successful teacher. As Doctor Morris explains: 


The concept of leadership is a useful way of designating or referring to those 
forms of behavior which include broad interests, control of feeling, tactful 
management, readiness and ability to undertake activities (often called initiative and 
resourcefulness), cooperativeness, enthusiasm, sympathy, and the like. 
Moreover, each of these terms refers to forms of behavior that may contribute to 
success in teaching (and, in terms of equivalent situations, to success in other 
professions). Therefore, reactions to a series of situations involving these 
tendencies as they occur in teaching, are indicative of probable success, especially 
in that profession. 

This point of view is further set forth in these words: 


Personality is the total blend of reaction tendencies, and these tendencies 
must be measured in terms of definite situations. An individual's personality to 
a great extent reflects the kind of stimuli which his environment constitutes for 
him. There is some recognition of this interplay of outer conditions and inner 
tendencies for explanation of the success of a student under one set of conditions 
whereas there was failure under different conditions. Selection of teachers, 
however, will naturally be guided by the standard, “We desire persons who are 
most apt to succeed in the various probable teaching situations.” It is important, 
therefore, to measure as many significant tendencies of the individual as possible. 

The general quality of leadership was accepted as a complex trait 
around which centered more specific personal qualities of 
“resourcefulness (including much of inventiveness and originality), insight, 
tact, degree of positiveness, and certain emotional attitudes” (13). 
The trait index is composed of five sections. Section I contains a 
list of 56 items for which the subject indicates his likes or dislikes, 
much as in the Strong interest blank, except a range of five rather than 
three “aspects of feeling” may be recorded for each item. Doctor 
Morris suggests that this procedure was based on the rather general 
view, “Tell me what a person likes and I will tell you what kind of a 
person he is.” Section II attempts to evaluate the subject's 
resourcefulness, insight, and attitudes. Sections III, IV, and V are designed 
in similar fashion to measure other qualities that enter into the 
composite trait, leadership; such as in III, tact, initiative; in IV, 
degree of positiveness of judgment; and in V, characteristic-feeling 
attitudes. 

Doctor Morris found a correlation between the trait index L test 
and practice teaching grades of 0.463 ± 0.068 (p. 3). While recognizing 
the weaknesses of practice teaching grades as a criterion, the author 
justifies her procedure by a quotation from Doctor Whitney's study 
of 1924: 

Whitney’s statistical study shows that grades taken from even geographically 
scattered schools justify the following comment: “The correlation between 
teaching success and student teaching remains the highest correlation when all 
other variables are kept constant.” 

Although Doctor Morris employed the most approved techniques in 
this exhaustive study one can not be greatly impressed with the value 
of trait index L as a predictive device for the selection of aspirants 
to the teaching profession. In the first place, while validated against 
the criterion of grades in cadet teaching, the reliability was established 
with only “60 college seniors preparing to teach in high schools, 
selected because various kinds of measures were available for each 
of them” (14). In fairness to the work of Doctor Morris it needs be 
said that other groups, some much larger in number, were used to 
determine the value of certain personal traits for use in the test. 
Secondly, the criterion of cadet teaching success is not a reliable 
enough standard by which to validate a test which involves reactions 
to emotional situations. Thirdly, Doctor Morris admits that it was 
extremely difficult to discover characteristic and sharply accentuated 
differentiation of response patterns between groups of recognized 
good and poor teachers. Finally, the low correlation of the test, 0.463, 
does not suggest a predictive device of great value. However, there 
seems to be much of worth suggested in the theory that lies back of the 
construction of this test. Successful teaching does call into play 
certain personality traits which all are ready to acknowledge though 
these traits can be defined or identified but vaguely. Further research 
in this direction may prove most valuable in the development of 
satisfactory teaching aptitude tests. 
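
The ± figure attached to the correlation for the trait index L test is presumably a probable error, the usual convention of the period. A minimal sketch, assuming the conventional formula PE = 0.6745(1 - r^2)/sqrt(N) and assuming the 60 cases mentioned above were the basis of the coefficient (the paper states neither), shows how the probable error depends on the number of cases. The sample sizes other than 60 are hypothetical.

```python
import math


def probable_error(r: float, n: int) -> float:
    """Probable error of a correlation coefficient r based on n cases.

    Assumes the conventional formula 0.6745 * (1 - r^2) / sqrt(n);
    the paper does not say which formula was used.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


r = 0.463  # trait index L vs. practice teaching grades, as quoted
for n in (60, 120, 240):  # 60 is from the text; 120 and 240 are hypothetical
    pe = probable_error(r, n)
    print(f"N = {n:3d}: r = {r:.3f} +/- {pe:.3f}, r / PE = {r / pe:.1f}")
```

With 60 cases the formula gives about 0.068, which agrees with the figure quoted; quadrupling the number of cases would halve the probable error.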

An example of the fourth type of teaching aptitude test available is the 
Coxe-Orleans prognosis test of teaching ability. Except that it is 
labeled a predictive measure of teaching aptitude, a discussion of the 
test would seem not in place in this paper. In fact, its provision for 
serious discussion may more properly belong to the paper of the 
preceding speaker. Back of this test is the implicit assumption that 
scholastic achievement is a valid criterion of future teaching success. 
This assumption is definitely set forth by the authors in their Manual 
of Directions, page 4, as follows: 


It is understood that the value of such predictive measures is greater if they 
deal directly with the student’s ability to teach than with his ability to master 
the work of the teacher-training institution. As data are obtained of the value 
of these measures in predicting success in teaching they will be made available. 
For the time being the contention is offered that the student who does well in 
the work of the teacher-training institution is more likely to be successful in 
teaching. 


The test is divided into five major parts with several of the parts 
divided into smaller sections. The plan of the test is essentially 
that of a survey of the subject’s knowledge of facts and theories of 
education. 

Part I. General information test. It presents a point of view and 
acquaintance with generally accepted opinions relative to education. 

Part II. Professional interest test. This also presents a point of 
view and acquaintance with generally accepted opinions relative to 
education. 

Part III. Lessons in education. Here a series of situations are 
presented and questions to be answered given. 

Part IV. Reading comprehension. Paragraphs are given about 
which the students’ powers of analysis and understanding are tested. 

Part V. Problems of education. This test presents crucial problems 
in American education and then a series of questions secures judgments 
from the subjects on these issues. 

The test requires three hours to complete, is well organized and 
thorough. It may be valuable as an instrument with which to predict 
the academic success of students but is of doubtful value as a measure 
of teaching aptitude. Even as a predictive measure of academic 
achievement in the normal school the authors present evidence to 
show that the Terman group test of mental ability is slightly better 
than the aptitude test as a predictive device of achievement success 
in the normal school of New York State. 

1. Interest in predictive factors of teaching success is of 
comparatively recent date. Very little had been done prior to 1925. Since 
that time major interest has developed in this field. 

2. In some measure the work in this field has paralleled the 
transitional development of mental tests and aptitude tests in other 
vocations. Two broad trends have been in evidence in most studies of 
teaching: (a) The attempt to discover teacher traits or qualities 
directly related to teaching, and (b) the attempt to devise tests of a 
somewhat general nature which would measure or predict teaching 
success. 

3. Thus far the studies of traits and factors predictive of teaching 
success have not been reassuring. Intelligence, general scholarship, 
and achievement in professional courses have shown disappointingly 
low correlations with later teaching success. Practice teaching 
experience alone has shown a significant relationship to later teaching 
success. Unfortunately this factor does not have high enough 
predictive value to be used with confidence and, since it comes at the 
close of the training period of the prospective teacher, is of no apparent 
value in the selecting of those who should enter the professional 
training in education. 

4. The attempt to devise tests of measurement of general teaching 
aptitude appears, in the light of the history of mental testing and 
present research results in this field, to hold greatest promise for the 
future. As yet the aptitude tests available are at best crude and of 
little value, though two or three are suggestive. 

5. A more serious situation faces the training schools charged with 
responsibility for educating aspirants for the profession. Adequate 
curricula can not with confidence be provided until better knowledge 
is available of those elements of training contributory to success in 
teaching. Further careful research would appear to be the 
prerequisite to the solution of the problem. 